Lumpers and splitters are opposing factions in any discipline that has to place individual examples into rigorously defined categories. The lumper–splitter problem occurs when there is the desire to create classifications and assign examples to them, for example schools of literature, biological taxa and so on. A "lumper" is an individual who takes a gestalt view of a definition, and assigns examples broadly, assuming that differences are not as important as signature similarities. A "splitter" is an individual who takes precise definitions, and creates new categories to classify samples that differ in key ways.
The earliest known use of these terms was by Charles Darwin, in a letter to Joseph Dalton Hooker in 1857: "It is good to have hair-splitters & lumpers." They were introduced more widely by George G. Simpson in his 1945 work The Principles of Classification and a Classification of Mammals. As he put it:
... splitters make very small units – their critics say that if they can tell two animals apart, they place them in different genera ... and if they cannot tell them apart, they place them in different species. ... Lumpers make large units – their critics say that if a carnivore is neither a dog nor a bear, they call it a cat.
Reference to lumpers and splitters in the humanities appeared in a debate in 1975 between J. H. Hexter and Christopher Hill, in the Times Literary Supplement. It followed from Hexter's detailed review of Hill's book Change and Continuity in Seventeenth Century England, in which Hill developed Max Weber's argument that the rise of capitalism was facilitated by Calvinist Puritanism. Hexter objected to Hill's "mining" of sources to find evidence that supported his theories. Hexter argued that Hill plucked quotations from sources in a way that distorted their meaning. Hexter explained this as a mental habit that he called "lumping". According to him, "lumpers" rejected differences and chose to emphasize similarities. Any evidence that did not fit their arguments was ignored as aberrant. Splitters, by contrast, emphasised differences, and resisted simple schemes. While lumpers consistently tried to create coherent patterns, splitters preferred incoherent complexity.
The categorization and naming of a particular species should be regarded as a hypothesis about the evolutionary relationships and distinguishability of that group of organisms. As further information comes to hand, the hypothesis may be confirmed or refuted. Sometimes, especially in the past when communication was more difficult, taxonomists working in isolation have given two distinct names to individual organisms later identified as the same species. When two named species are agreed to be the same species, the older name is almost always retained and the newer name dropped, following a convention known as "priority of nomenclature". This form of lumping is technically called synonymization. Dividing a taxon into multiple, often new, taxa is called splitting. Taxonomists are often referred to as "lumpers" or "splitters" by their colleagues, depending on their personal approach to recognizing differences or commonalities between organisms.
In history, lumpers are those who tend to create broad definitions that cover large periods of time and many disciplines, whereas splitters want to assign names to tight groups of inter-relationships. Lumping tends to create a more and more unwieldy definition, with members having less and less mutually in common. This can lead to definitions which are little more than conventionalities, or groups which join fundamentally different examples. Splitting often leads to "distinctions without difference", ornate and fussy categories, and failure to see underlying similarities.
For example, in the arts, "Romantic" can refer specifically to a period of German poetry from roughly 1780 to 1810, a definition that would exclude the later work of Goethe, among other writers. In music it can mean every composer from Hummel through Rachmaninoff, plus many who came after.
Software engineering often proceeds by building models (sometimes known as model-driven architecture). A lumper is keen to generalize, and produces models with a small number of broadly defined objects. A splitter is reluctant to generalize, and produces models with a large number of narrowly defined objects. Conversion between the two styles is not necessarily symmetrical. For example, if error messages in two narrowly defined classes behave in the same way, the classes can be easily combined. But if some messages in a broad class behave differently, every object in the class must be examined before the class can be split. This illustrates the principle that "splits can be lumped more easily than lumps can be split".
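The asymmetry between lumping and splitting can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the `Message` type, the category names `DiskError`, `NetworkError` and `Error`, and the text-based rule in `classify_by_text`. Lumping needs only a lookup on the category label, whereas splitting needs a rule that reads the content of every member of the broad category.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Message:
    text: str
    category: str  # a narrow (splitter-style) or broad (lumper-style) label


# Hypothetical mapping from two narrow categories to one broad category.
LUMP_MAP = {"DiskError": "Error", "NetworkError": "Error"}


def lump(messages: list[Message]) -> list[Message]:
    """Merge narrow categories: a label-only lookup; message content is never read."""
    return [replace(m, category=LUMP_MAP.get(m.category, m.category)) for m in messages]


def split(
    messages: list[Message], broad: str, classify: Callable[[Message], str]
) -> list[Message]:
    """Divide one broad category: classify() must examine every member of it."""
    return [
        replace(m, category=classify(m)) if m.category == broad else m
        for m in messages
    ]


def classify_by_text(m: Message) -> str:
    # Assumed rule for this sketch only; a real split would need domain knowledge.
    return "DiskError" if "disk" in m.text.lower() else "NetworkError"


narrow = [
    Message("Disk full", "DiskError"),
    Message("Connection reset", "NetworkError"),
]
broad = lump(narrow)                                # no message text inspected
restored = split(broad, "Error", classify_by_text)  # every "Error" inspected
```

Here `lump` works from a fixed table, while `split` only recovers the original categories because `classify_by_text` happens to match how these sample messages were labelled. With a broad class whose members differ in ways the rule does not anticipate, each object would have to be reviewed individually.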
There is no agreement among historical linguists about what amount of evidence is needed for two languages to be safely classified in the same language family. For this reason, many language families have had lumper–splitter controversies, including Altaic, Pama–Nyungan, Nilo-Saharan, and most of the larger families of the Americas. At a completely different level, the splitting of a mutually intelligible dialect continuum into different languages, or lumping them into one, is also an issue that continually comes up, though the consensus in contemporary linguistics is that there is no completely objective way to settle the question.
Splitters regard the comparative method (meaning not comparison in general, but only reconstruction of a common ancestor or protolanguage) as the only valid proof of kinship, and consider genetic relatedness to be the question of interest. American linguists of recent decades tend to be splitters.
Lumpers are more willing to admit techniques like mass lexical comparison or lexicostatistics, and mass typological comparison, and to tolerate the uncertainty of whether relationships found by these methods are the result of linguistic divergence (descent from a common ancestor) or language convergence (borrowing). Much long-range comparison work has come from Russian linguists belonging to the Moscow School of Comparative Linguistics, most notably Vladislav Illich-Svitych and Sergei Starostin. In the United States, Joseph Greenberg's and Merritt Ruhlen's work has met with little acceptance from linguists. Earlier American linguists like Morris Swadesh and Edward Sapir also pursued large-scale classifications, such as Sapir's 1929 scheme for the Americas, accompanied by controversy similar to that of today.
Paul F. Bradshaw suggests that the same principles of lumping and splitting apply to the study of early Christian liturgy. Lumpers, who tend to predominate, try to find a single line of texts from the apostolic age to the fourth century (and later). Splitters see many parallel and overlapping strands which intermingle and flow apart so that there is not a single coherent path in development of liturgical texts. Liturgical texts must not be taken solely at face value; often there are hidden agendas in texts.
The Hindu religion is essentially a lumper's concept, sometimes also known as Smartism. Hindu splitters, and individual adherents, often identify themselves as adherents of a religion such as Shaivism, Vaishnavism, or Shaktism according to which deity they believe to be the supreme creator of the universe.
Physicist and philosophy writer Freeman Dyson has suggested that one can broadly, if over-simplistically, divide "observers of the philosophical scene" into splitters and lumpers, roughly corresponding to materialists (who imagine the world as divided into atoms) and Platonists (who regard the world as made up of ideas).
In psychiatry, the 'splitters' and the 'lumpers' have fundamentally different approaches to psychiatric diagnosis and classification. 'Splitters' emphasise the heterogeneity within the diagnostic categories and argue that this heterogeneity drives the 'splitting' process. 'Lumpers', on the other hand, point to the similarities between the diagnostic categories, and suggest that these similarities justify the creation of broader entities.