Create complete phoneme databases of historical languages
Posted: 01.01.2024 11:14
- IMPORTANT: Please note that the author's disclaimer applies to all content posted by the author in this topic. You can find this disclaimer attached to the author's first post in this topic. The article is currently being edited and may still contain errors. -
With moderate effort and simple means, it is possible to create databases for language research that expand our understanding of languages that have been lost or are still being researched.
In this topic I would like to explain step by step how, with appropriate (manual) effort, databases (for example for phonemes, syllables, pronunciation, etc.) can be created using conventional office software (complex spreadsheet documents).
The purpose of this project is to demonstrate one area of possibilities in which mathematics (combinatorics) can support language research. This is based on the assumption that a database with, for example, all possible linguistic and textual expression options can expand and enhance one's own engagement with a researched language. For reasons of effort, the scope of this project must be limited to texts or text modules with very few placeholders (e.g. phonemes or syllables). However, the principle can be expanded accordingly with appropriate technical know-how or human resources.
The connections described below are equally suitable for gaining an insight into the combinatorics that is useful, and therefore necessary, for deciphering texts (cryptological basics). Since the overall topic is very complex, I needed several attempts here in the forum to find a suitable form for this context.
The Sumerian language and its associated basic phonemic inventory
(This topic cannot possibly provide an overview of the overall aspects of the Sumerian language. Please refer to the relevant sources listed in the bibliography, for example.)
The Sumerian language is well suited to provide a formal introduction to the topic of this post: the phoneme inventory of the Sumerian language has (to my knowledge and with reference to the sources I used) four vowels (/a e i u/) and 16 consonants, which makes the Sumerian language "quite simple" [gwiki1]. The advantage of this number of vowels and consonants - from a mathematical point of view - is that a combination of 4 + 16 elements can be viewed very clearly in combinatorial terms and is therefore easy to represent. This helps to explain the mathematical and combinatorial (cryptological) connections of the topic well and clearly.
Combinatorial basics for creating databases for e.g. phonemes or syllables of a language
From a combinatorial perspective, the principle described here for creating a database of a language - e.g. for phonemes or syllables - can be explained as follows (it is one possible, special principle, explained here using the example of the Sumerian language). The procedure is split into successive basic steps. These steps can all be done either by hand or in spreadsheet documents (please be sure to note the warnings in the disclaimer!):
Step 1:
In order to create the database, the language to be analyzed and its phoneme inventory* (ideally one that is completely known) must first be determined. This definition also determines the number of elements to be combined with one another mathematically (combinatorially) as well as the resulting “chain combinations”.
Due to the scope required, I will only address the special case of (to our current knowledge) "completely known" language phoneme inventories in passing. For the sake of simplicity, at this point in the discussion I am assuming that the phoneme inventory of the Sumerian language known to us today - to our current knowledge - has been fully and completely deciphered (However, as an actual amateur and non-linguist, I could be wrong about this; my main focus here is on mathematical questions about the structure and deciphering of languages. In order to be able to explain the relevant procedures well, I need a database that is as clearly clarified as possible.).
When I continue to deal with the Sumerian language and its phoneme inventory, I will assume for the sake of simplicity (in order to be able to explain the entire procedure well) that the Sumerian language actually has a total of 20 phonemes (4 vowels and 16 associated consonants).
Step 2:
The specified number of elements is combined in chain combinations until the desired number of placeholders lined up one after the other (more on this later) is reached. As an introduction to the topic, I will initially use a smaller number of elements to explain the respective principle for combining elements (e.g. phonemes or syllables) with each other - there are different methods for this. In order to understand how the elements are combined and what mathematical results follow, it is necessary to deal with the mathematical concept of the Cartesian product (more on this later) and later, in direct comparison with the Cartesian product, with a principle that I call the "telephone book principle".
Step 3:
Once all elements of a project (e.g. phonemes or syllables) have been successfully combined with one another - as far as this is technically (e.g. manually) possible - the results can be collected in a specific database (more on the benefits and use of such a database later).
About the Cartesian product
The term Cartesian product is a term from (so-called "naive") set theory (according to Cantor) [Reiss/Schmieder, 2007]; [gwiki2]. The Cartesian product describes the “crossed” (simple) combination of the elements of two sets. It can be explained clearly using two equal sets (i.e. two sets that contain exactly the same elements). If we look at the sets A, B with A = {1, 2, 3} and B = {1, 2, 3}, we get a good basis for explaining the Cartesian product. At this point it is necessary to explain that a mathematical set in the sense of set theory contains its elements without any order. A tuple, by contrast, is what we commonly speak of in the mathematical sense when we consider elements in a fixed order: the elements of a set can be listed in any order without changing the set, while a tuple always contains its elements in a specific order. In computer science, for example, the ordered consideration of elements (tuples) is often used [Reiss/Schmieder, 2007].
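The difference between a set and a tuple can be illustrated with a small sketch. Python is used here purely for illustration (it is not part of the spreadsheet procedure described in this topic), and all variable names are hypothetical:

```python
# A set ignores the order of its elements; a tuple keeps it.
set_a = {1, 2, 3}
set_b = {3, 2, 1}
tuple_a = (1, 2, 3)
tuple_b = (3, 2, 1)

# Same elements in a different order: equal as sets, different as tuples.
sets_equal = set_a == set_b
tuples_equal = tuple_a == tuple_b

print(sets_equal, tuples_equal)
```

This is why an ordered inventory (a tuple) is convenient when combinations are later to be given an index.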
For the combinatorial creation of phoneme databases, it can be useful, but is generally not essential, whether and how we order the phonemes of the phoneme inventory (here using the example of the Sumerian language with 4 vowels and 16 associated consonants): we can therefore use sets or tuples, as we prefer. Where we define our own order for a basic inventory of elements, this order corresponds - in my opinion - not necessarily to a tuple in the mathematical sense (depending on the definition), but in any case to a specifically ordered set. Because of these mathematical pitfalls in defining an "ordered number of elements", it is easier for me to choose ordering systems that are as obvious as possible - e.g. the order of numbers or letters in specific systems - so that I can call such ordered sets "tuples".
This step fulfills a purpose - which I will discuss later - but which does not change the overall result of "chain combinatorics": this is because in this topic I only explain the creation of complete element databases, i.e. all elements are combined with each other until all possible combinations have been exhausted. The orderly consideration of elements to be combined therefore fulfills its purpose more in a simplified mathematical consideration and in the subsequent possible assignment of elements in the form of an index (more on this later).
The purpose of such an index - as far as I know so far (which I must consciously formulate as a conjecture here) - is that "linguistic peculiarities", e.g. those of a specific author of historical documents, can be better assigned, among other things.
Elevating a defined number of elements to a Cartesian product means, so to speak, "crossing" every single element with every other element (e.g.) of a set (here this term is meant exclusively in the strictly mathematical sense).
We can therefore write in simplified terms, e.g.:
Cartesian product of A and B, or of A and A(1), or in the notation I use, of A(0) and A(1): A(0)XA(1)
Applied to a simple ("two-dimensional") tabular structure, this means that every element of the set/tuple A(0) is combined with every element of the set/tuple A(1). For "language research" in the sense of creating phonetic databases, it is important to aim for A(0) = A(1) (exactly the same elements should be contained in set/tuple A(0) as in set/tuple A(1)). This condition does not necessarily have to be met for the topic described here, but it makes combinatorial work much easier. The result of applying this principle is a very good, clear "data economy". Experience has shown that this approach actually saves time and concentration when working, e.g.:
A(0) = {1, 2, 3} and A(1) = {1, 2, 3}
You can also write: "The sets A and B have the same cardinality (because the two sets contain exactly the same elements)". Furthermore, when considering tuples it could be written: "Tuple A and tuple B have the same length and contain exactly the same elements in the same order.", e.g.:
(x1 ... x3)(0) = {1, 2, 3}
(x1 ... x3)(1) = {1, 2, 3}
This connection can be presented in a table, for example, as follows (here in notation with tuples) (see appendix 1).
(Note: The commas between the elements are not decimal commas in the sense of German number notation, but separate the elements of a list that belong together.)
The elements ordered (here according to a certain scheme) arise from A(0)XA(1) as:
A(0)XA(1) = {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)}
which therefore represent the Cartesian product of A(0) and A(1).
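The listing above can be reproduced with a few lines of Python. This is only a sketch for checking the result, assuming the standard library function `itertools.product`; the names `a0`, `a1` and `pairs` are chosen here for illustration:

```python
from itertools import product

# Two equal base inventories, written as tuples so that the output order is fixed.
a0 = (1, 2, 3)
a1 = (1, 2, 3)

# Cartesian product A(0)XA(1): every element of a0 is paired with every element of a1.
pairs = list(product(a0, a1))

print(pairs)
print(len(pairs))  # number of combinations: 3 * 3
```

The pairs come out in the same sorting as in the listing, first by the element taken from A(0), then by the element taken from A(1).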
Upon closer analysis of the Cartesian product of A(0)XA(1), two particularly noteworthy aspects emerge:
The table can be mathematically referred to as a square matrix or "matrix". Because the number space of natural numbers can be described with the properties of "multidimensionality", I also refer to such a matrix as "two-dimensional".
In the matrix, two areas that mirror each other across the main diagonal can be identified, as well as the elements located on the main diagonal itself. If the matrix is broken down into cell contents in the form of a table, a structure emerges that can be divided into 3 main areas. I call these areas alpha (α), beta (β) and gamma (γ) (see appendix 2). This classification and the names I have chosen should in no way be viewed as fixed. Mathematical expressions are always open to discussion; this method of representation and naming is only one of many options and serves to better describe the matrix contents. It is now possible to formally describe the matrix and its contents:
R(mxn) is a square matrix if m = n (more on this later).
Applied to the possible combination of phonemes, for example, the principle of the Cartesian product with its specific results should not be confused with the following combinatorial notation, in which all elements of a set are crossed with all elements of the same set and written out in all possible sequences ("dictionary notation" or "telephone book notation"), with:
{(a, b, c) ∈ A, (a, b, c) ∈ B}
A = {a, b, c}
B = {a, b, c}
linear combination options for A, B:
1, 1, 1 or a, a, a [line I]
1, 1, 2 or a, a, b [line II]
1, 1, 3 or a, a, c [line III]
1, 2, 1 or a, b, a [line IV]
1, 2, 2 or a, b, b [line V]
1, 2, 3 or a, b, c [line VI]
1, 3, 1 or a, c, a [line VII]
1, 3, 2 or a, c, b [line VIII]
1, 3, 3 or a, c, c [line IX]
2, 1, 1 or b, a, a [line X]
2, 1, 2 or b, a, b [line XI]
2, 1, 3 or b, a, c [line XII]
2, 2, 1 or b, b, a [line XIII]
2, 2, 2 or b, b, b [line XIV]
2, 2, 3 or b, b, c [line XV]
2, 3, 1 or b, c, a [line XVI]
2, 3, 2 or b, c, b [line XVII]
2, 3, 3 or b, c, c [line XVIII]
3, 1, 1 or c, a, a [line XIX]
3, 1, 2 or c, a, b [line XX]
3, 1, 3 or c, a, c [line XXI]
3, 2, 1 or c, b, a [line XXII]
3, 2, 2 or c, b, b [line XXIII]
3, 2, 3 or c, b, c [line XXIV]
3, 3, 1 or c, c, a [line XXV]
3, 3, 2 or c, c, b [line XXVI]
3, 3, 3 or c, c, c [line XXVII]
This notation is also of great importance for combinatorics in the sense of "linguistic research" and e.g. for deciphering historical texts with their mathematically resulting possibilities - and to say it in advance: it is the fully complete, but extremely laborious method. Even in the small example of 3 elements it produces 3*3*3 = 3^3 = 27 possibilities, whereas raising 3 elements to the Cartesian product (here "first stage", or 1st evolution) yields only 3*3 = 9 combinations, and raising 4 elements to the Cartesian product yields 4*4 = 16 combinations.
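The "telephone book" listing for three elements can be generated mechanically. The following is a minimal sketch, assuming that the listing is simply every sequence of length 3 over the symbols a, b, c in lexicographic order; the names `symbols` and `rows` are illustrative:

```python
from itertools import product

symbols = ("a", "b", "c")

# Every sequence of length 3 over the three symbols, in lexicographic order.
rows = list(product(symbols, repeat=3))

# Print the rows with a running number, like the lines of the listing.
for number, row in enumerate(rows, start=1):
    print(number, ", ".join(row))

print(len(rows))  # 3 * 3 * 3 rows in total
```

Counting the generated rows is a quick way to check a hand-written listing for gaps or duplicates.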
The differences and effects to generate combinations between these two essential methods will be discussed later.
In short, as you may have already noticed: a (squared) Cartesian product of x elements combined with each other always produces a square number of possible combinations (x^2), while the so-called “dictionary notation” produces a number of possible combinations that grows according to the principle of powers (x^x), e.g.:
Table of resulting combinations depending on the used principle:
comb = "combinations"
CP = Cartesian Product (here always "squared" Cartesian product)
TBP = Telephone Book Principle
x = number of elements contained in the base of elements to be combined
[x] / [comb. CP] / [comb. TBP] // [line]
[1] / [1^2 = 1] / [1^1 = 1] // [line I]
[2] / [2^2 = 4] / [2^2 = 4] // [line II]
[3] / [3^2 = 9] / [3^3 = 27] // [line III]
[4] / [4^2 = 16] / [4^4 = 256] // [line IV]
[5] / [5^2 = 25] / [5^5 = 3125] // [line V]
[6] / [6^2 = 36] / [6^6 = 46656] // [line VI]
[7] / [7^2 = 49] / [7^7 = 823543] // [line VII]
etc.etc.
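The two columns of the table can be recomputed with a short sketch. The function names are hypothetical and only mirror the abbreviations CP and TBP used above:

```python
def count_cartesian_square(x: int) -> int:
    """Combinations of a squared Cartesian product of x elements: x^2."""
    return x ** 2


def count_telephone_book(x: int) -> int:
    """Combinations under the telephone book principle with x places: x^x."""
    return x ** x


# One row per number of elements x, in the column order of the table.
table = [(x, count_cartesian_square(x), count_telephone_book(x)) for x in range(1, 8)]

for x, cp, tbp in table:
    print(f"[{x}] / [{cp}] / [{tbp}]")
```

Extending `range(1, 8)` shows how quickly the telephone book column outgrows the Cartesian column.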
As is clear from the disproportionately strong increase in results when using the (maximally complete, i.e. gap-free) telephone book principle, one of the main objectives in creating databases that are as complete as possible (for phonemes or syllables, for example, or for encoding or deciphering texts) is to limit the number of possible combinations to be analyzed as much as possible (and as early in a project as possible). The principle of the Cartesian product, among other essential methods, is suitable for exactly this purpose, provided it is applied at a sensible time and at the appropriate point in a project.
The characteristic that a Cartesian product in a square matrix, i.e. with A = B, always generates a square number of possible combinations (with a single exception, which will be discussed later) is a big advantage in terms of data economy when creating complete tabular data lists (here databases with e.g. phonemes or syllables), which I will explain below.
First of all: both methods offer their own specific advantages and possible applications, which depend on what is to be achieved. Both methods can be combined with each other if this makes sense. However, I will discuss the different methods (and the possible combination of the two different methods in specific areas) one after the other and initially concentrate on the method of determining combinations using the Cartesian product principle.
Determine chain combinations by creating and combining Cartesian products
The big and special advantage of determining combinations (chain combinations) using the principle of the Cartesian product is that Cartesian products, when applied to square matrices, always output a square number of possible combinations in a specific sorting (please note the only exception here, which I will explain further below). This connection can be used up to a certain resource-related limit when manually determining (in this case in tabular form) phoneme or syllable combinations (as well as letter combinations, for example). “Resources” here means factors such as writing and computing time and the time that you are able to devote to concentrated work (which can of course vary greatly from person to person).
In order to better explain what this advantage is, a short excursion into the world of so-called figurate numbers, and thus into the world of triangular numbers and square numbers as well as the Gaussian formula for triangular numbers, is necessary.
As the young Gauss (1777 - 1855) [gWiki8], one of the most famous mathematicians in the world today and probably of all time, recognized at a young age, square numbers in the number space of natural numbers can always be formed according to the same scheme by combining two directly successive triangular numbers. Even if it is strongly assumed that the ancient Greeks, for example, were aware of this connection, Gauss is considered to be the first who verifiably published about this and summarized the principle in a formula [duSautoy,39].
The principle of forming square numbers can be explained relatively simply using two series of numbers (triangular numbers) placed next to each other. The formation of the triangular numbers in turn follows their very own - characteristic - recursivity, whereby the recursivity of both number series; of triangular numbers and square numbers; can be conclusively related to each other in a tabular overview of number series (this represents one of a few possible structural proofs, about which I cannot judge at this point, based on my current knowledge, to what extent they are generally known).
Square numbers and their formation:
with:
ℕ = {0, 1, 2, 3, 4, 5, 6, 7, ...}
x² = square numbers
x² ∈ ℕ = {0, 1, 4, 9, 16, 25, 36, 49 ...}
▲ = triangular numbers
▲ ∈ ℕ = {0, 1, 3, 6, 10, 15, 21, 28, 36 ...}
Evolution of triangular numbers in ℕ:
0+1 = 1
1+2 = 3
1+2+3 = 6
1+2+3+4 = 10
1+2+3+4+5 = 15
1+2+3+4+5+6 = 21
1+2+3+4+5+6+7 = 28
1+2+3+4+5+6+7+8 = 36
alternatively you can write e.g:
▲(1) = ∑(0...1) = 1
▲(2) = ∑(0...2) = 3
▲(3) = ∑(0...3) = 6
▲(4) = ∑(0...4) = 10
etc.etc.
Another way I use for triangular numbers is:
▲∑(1) = 1
▲∑(2) = 3
▲∑(3) = 6
▲∑(4) = 10
etc.etc.
Evolution of square numbers in ℕ:
0 + 1 = 1
1 + 3 = 4
3 + 6 = 9
6 + 10 = 16
10 + 15 = 25
15 + 21 = 36
etc.etc.
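Both number series, and the way each square number arises from two directly successive triangular numbers, can be checked with a short sketch. It assumes the closed formula n(n+1)/2 for the n-th triangular number (the Gaussian formula mentioned above); the function and variable names are illustrative:

```python
def triangular(n: int) -> int:
    """n-th triangular number 0 + 1 + ... + n, via the closed formula n(n+1)/2."""
    return n * (n + 1) // 2


# The first triangular numbers, starting with triangular(0) = 0.
triangulars = [triangular(n) for n in range(9)]

# Each square number n^2 as the sum of two directly successive triangular numbers.
squares_from_triangulars = [triangular(n - 1) + triangular(n) for n in range(1, 8)]

print(triangulars)
print(squares_from_triangulars)
```

The second list reproduces the sums 0 + 1, 1 + 3, 3 + 6 and so on from the overview above.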
The developmental recursivity of triangular numbers and square numbers:
As a structural proof of the recursivity of triangular numbers and square numbers, the following connection essentially represents a modified excerpt from Pascal's triangle (see Appendix 3).
The graphical representation is suitable for clarifying the structural connection between triangular numbers and square numbers (see Figure 4; slight distortions in the representation are possible due to screen technology and formatting technology, so that the figures shown are not 100% correctly displayed as square figures).
Developmental connection in the defined matrix between triangular numbers and square numbers
The graphical representation shows how triangular numbers and square numbers can be viewed and defined quantitatively in their mutually dependent evolution: the part of the larger triangular number that, together with the preceding triangular number, forms a specific square number can be assigned to the main diagonal of a square matrix, and thus to the elements lying on that main diagonal. This means: the elements in the square matrix previously designated gamma can, set-theoretically and depending on the point of view, be assigned either to the element set alpha or to the element set beta. This allows statements about the formation of specific sets to be described in a simplified manner. However, with a view to the benefit of the defined matrix, it is easier to treat the elements on the main diagonal, the fraction gamma, as an independent set.
With regard to the formation of square numbers from directly successive triangular numbers, it makes sense to introduce a further consideration of the square figure: a square figure with side length x can be laid out with x² elements (e.g. square pieces of cardboard or mussel shells arranged in a square). This establishes the connection to the figurate numbers, which were already intensively researched and discussed by the ancient Greeks. By referring to the figurate numbers, the consideration can also be transferred directly to calculating with the matrix defined here:
Set-theoretically, the x² elements of the matrix (which are formed from the sets Alpha, Beta, Gamma) can be counted in a way that expresses how square numbers are formed from triangular numbers: Gamma contains x elements, and Alpha and Beta each contain ▲(x-1) elements, so that x² = ▲(x-1) + ▲(x-1) + x = ▲(x-1) + ▲(x). Furthermore, further fundamental statements about square numbers (and thus square figures) are possible, which can be directly transferred to the handling of the matrix defined here:
Statement: the elements contained in the set Alpha+Beta+Gamma of a square matrix can always be divided according to the following key if we define and view the square matrix as a (square) Cartesian product: the elements contained in the Alpha and Beta regions of the matrix mirror each other in a definable way, while each element in the Gamma region is unique, if we consider the regions of the matrix as sets and the cell contents as their elements. In set-theoretic terms, the matrix is thus the union of the three disjoint sets Alpha, Beta and Gamma.
This set-theoretic connection can be used to create, for example, phonemic or syllabic databases in order to achieve the most efficient data economy.
EXAMPLE:
If a number of 4 different elements are raised to form a (square) Cartesian product, the following possible combinations arise, e.g.:
A(0), A(1) = {1, 2, 3, 4}
A(0) = A(1)
resulting combinations (in a specially defined sorting):
(see appendix 6)
{A(0)XA(1) ∈ ℕ / (a, b, c, d), (a = 1, b = 2, c = 3, d = 4), (a, b, c, d ∈ A(0), a, b, c, d ∈ A(1)) / A(0)XA(1) = (Α,Β,Γ)}
A(0)XA(1) ∈ ℕ - beta = {(1,1), (2,1), (3,1), (4,1), (2,2), (3,2), (4,2), (3,3), (4,3), (4,4)}
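The reduced set for four elements can be derived mechanically. The following sketch makes one assumption that should be checked against the appendix: alpha is taken to be the pairs whose first entry is larger than the second, beta the mirrored pairs, and gamma the pairs on the main diagonal. This assignment is inferred only from the listing above; all names are illustrative:

```python
from itertools import product

base = (1, 2, 3, 4)
full = list(product(base, base))  # the complete square Cartesian product

# Assumed assignment of the three regions (inferred, not confirmed by the appendix):
alpha = [(r, c) for r, c in full if r > c]   # one side of the main diagonal
beta = [(r, c) for r, c in full if r < c]    # the mirrored side
gamma = [(r, c) for r, c in full if r == c]  # the main diagonal itself

# Everything except beta, sorted by the second entry and then by the first entry.
reduced = sorted(alpha + gamma, key=lambda pair: (pair[1], pair[0]))

print(len(alpha), len(beta), len(gamma))
print(reduced)
```

With four elements this leaves 10 of the 16 combinations to be written down; the other 6 are the mirrored pairs.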
When (manually) creating databases for, for example, phonemes and syllables of a specific language, a - specific - considerable amount of data (according to a fixed calculation key) can be saved if the "backward reading" option is used.
In this context, "reading backwards" is actually meant in the literal sense: with the appropriate disposition or experience and training, it is possible to recognize the meaning of text even in texts written backwards. Experienced readers can make use of this effect when creating, for example, phoneme databases based on the principle of using the Cartesian product of basic data sets.
Creating a database for phonemes or syllables, for example, using the principle of the Cartesian product works according to the method described below:
Create a database manually, e.g. for phonemes or syllables, using the principle of the Cartesian product:
To create such a database manually, the following steps are necessary:
Step 1: Define the basic inventory of phonemes or syllables.
Step 2: Collect elements of the basic inventory to form a (square) Cartesian product.
Step 3: Eliminate unnecessary data by using the “read backwards” option, or omit it from the start.
(Note: Steps 2 and 3 are repeated frequently depending on the desired combination result in the sense of "chain combinations" - more on this later.)
To the individual steps:
Step 1: Define the basic inventory of phonemes or syllables
For the following application example, I define a selected (here fictitious) basic inventory of syllables. The individual syllables, as elements of the set A(0) with 4 different elements in total, are:
1 = bo
2 = kun
3 = tun
4 = tos
A(0) = {bo, kun, tun, tos}
The individual syllables can optionally be represented as numbers in a square matrix. If A(0) and A(1) are now raised to the Cartesian product at A(0) = A(1), the corresponding (all) possible combinations of the syllables arise in the resulting square matrix (see appendix 7 and 8).
How useful this approach can be for creating databases with complete combination numbers of, for example, phonemes or syllables becomes clear when (also fictional) translations (here in the example into English) are assigned to the fantasy syllables, e.g.:
1 = bo = "boat"
2 = kun = "far"
3 = tun = "wide water"
4 = tos = "move"
The complete, gap-free combinations created in the square matrix as a (square) Cartesian product can be compared with the reduced set of elements: it becomes clear that the textual meaning can also be grasped (with appropriate skills or training) when the combinations are read backwards:
Overall combinations determined in the example used here for four different (autonomous) syllables (here also called “solitaires”):
(listed in the specific sorting selected here)
Basic inventory of defined solitaires (here syllables) and their assignment:
1 = bo = "boat"
2 = kun = "far"
3 = tun = "wide water"
4 = tos = "move"
Resulting syllable constellations in A(0)XA(1) (squared Cartesian Product):
centered:
I: bo,bo
II: kun,kun
III: tun,tun
IV: tos,tos
- - -
{regular} (centered) [mirrored]
I: {kun,bo} [bo,kun]
II: {tun,bo} [bo,tun]
III: {tos,bo} {tun,kun} [kun,tun] [bo,tos]
IV: {tos,kun} [kun,tos]
V: {tos,tun} [tun,tos]
Resulting syllable constellations in A(0)XA(1) (squared Cartesian Product) "translated":
centered:
I: boat,boat
II: far,far
III: wide water,wide water
IV: move,move
- - -
{regular} (centered) [mirrored]
I: {far,boat} [boat,far]
II: {wide water,boat} [boat,wide water]
III: {move,boat} {wide water,far} [far,wide water] [boat,move]
IV: {move,far} [far,move]
V: {move,wide water} [wide water,move]
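The syllable example can be sketched in a few lines. The sketch assumes that only the "centered" and "regular" combinations are written down and that the "mirrored" ones are recovered by reading each stored pair backwards; the names `syllables`, `stored` and `recovered` are illustrative:

```python
from itertools import product

syllables = ("bo", "kun", "tun", "tos")
position = {syllable: i for i, syllable in enumerate(syllables)}

# All 16 combinations of the square Cartesian product.
full = set(product(syllables, repeat=2))

# Stored part: the diagonal ("centered") plus the "regular" pairs, i.e. pairs whose
# first syllable does not come before the second one in the basic inventory.
stored = [(a, b) for a, b in product(syllables, repeat=2) if position[a] >= position[b]]

# Reading every stored pair backwards recovers the "mirrored" pairs.
recovered = set(stored) | {(b, a) for a, b in stored}

print(len(stored), "of", len(full), "combinations are written down")
print(recovered == full)
```

Replacing the syllables by their fictitious translations ("boat", "far", "wide water", "move") gives the translated lists in the same way.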
The meaning of this combinatorial and combination-eliminating approach becomes clear when we consider the data economic effect of this approach in chain combinations.
Chain combinatorial data economics in (square) Cartesian products
The previous explanations have shown that when determining the number of combinations in square matrices using the principle of (square) Cartesian products, elements (and thus data) can be eliminated if the "read backwards" option is used in the result output. This “data economy” or “data efficiency” can be precisely calculated in terms of an “ideally efficient data yield”. The corresponding calculation options for triangular numbers and square numbers are used for this purpose. These relationships can be clearly illustrated in a table overview:
Data economic yield when applying the read-back option in square Cartesian products:
generally for square matrices A(0)XA(1); "elements" = e.g. phonemes or syllables
Table columns:
[C1] = [elements; x]
[C2] = [combinations; x²]
[C3] = [data saving (combinations; proportionate)]
[C4] = [data saving (combinations; fractional)]
[C5] = [data saving (combinations; percent)]
[C1] / [C2] / [C3] / [C4] / [C5] // [line]
[x] / [x²] / [prop.] / [fract.] / [%]
- - - - - - - - - - - - - - - - - - - - - - - - - -
[1] / [1] / [0] / [-] / [0.00%] // [line I]
[2] / [4] / [1] / [1/4] / [25.00%] // [line II]
[3] / [9] / [3] / [3/9] / [33.33p%] // [line III]
[4] / [16] / [6] / [6/16] / [37.50%] // [line IV]
[5] / [25] / [10] / [10/25] / [40.00%] // [line V]
[6] / [36] / [15] / [15/36] / [41.66p%] // [line VI]
[7] / [49] / [21] / [21/49] / [42.85...%] // [line VII]
[8] / [64] / [28] / [28/64] / [43.75%] // [line VIII]
[9] / [81] / [36] / [36/81] / [44.44p%] // [line IX]
[10] / [100] / [45] / [45/100] / [45.00%] // [line X]
[...] / [...] / [...] / [...] / [...] // [...]
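The table can be recomputed, and extended to any number of elements, with the following sketch. It assumes that the saved combinations are exactly the mirrored pairs, i.e. x(x-1)/2 out of x²; the function name `saving` is hypothetical:

```python
from fractions import Fraction


def saving(x: int) -> Fraction:
    """Share of the x^2 combinations that need not be written down."""
    mirrored = x * (x - 1) // 2  # number of mirrored pairs
    return Fraction(mirrored, x * x)


# Columns: elements x, combinations x^2, saved combinations, saved share.
rows = [(x, x * x, x * (x - 1) // 2, saving(x)) for x in range(1, 11)]

for x, total, saved, share in rows:
    print(f"[{x}] / [{total}] / [{saved}] / [{float(share) * 100:.2f}%]")
```

Note that `Fraction` reduces the fractions automatically, so 6/16 is held as 3/8; the percentages are unaffected.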
This data reduction, which grows with the number of elements and approaches the absolute limit of 1/2 (50%) without ever reaching it, can be used for the chain-like determination of combinations for even larger numbers of elements of a basic inventory (e.g. phonemes or syllables) of a selected language (and for other purposes).
For example, with a total of 20 (solitary) elements of a basic language inventory (see the Sumerian language and its phoneme inventory as far as we know today), a simple (linear) application of the method described above already yields a data saving of 47.50%. This factor increases accordingly if the method is adapted into chain operations. Measured against the limit of 50%, the data saving possible with 20 placeholders (elements, e.g. phonemes or syllables) in simple-linear combination chains is thus already 47.50/(50/100) = 95% (more on this later).
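The figures for 20 elements follow from a short calculation. The symbol s(x) for the saved share is introduced here only for illustration, and \triangle(n) stands for the n-th triangular number ▲(n):

```latex
s(x) \;=\; \frac{\triangle(x-1)}{x^{2}}
     \;=\; \frac{x(x-1)/2}{x^{2}}
     \;=\; \frac{x-1}{2x}
     \;=\; \frac{1}{2} - \frac{1}{2x} \;<\; \frac{1}{2},
\qquad
s(20) \;=\; \frac{19}{40} \;=\; 47.5\,\%,
\qquad
\frac{s(20)}{1/2} \;=\; \frac{19}{20} \;=\; 95\,\%.
```

The term 1/(2x) shrinks as x grows, which is why the saving approaches 50% but never reaches it.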
How a chain-like formation of possible (complete) combinations of a specific basic inventory of elements (e.g. a specific language) can be linked together in a chain-combinatory manner - and thus used more effectively - is explained below:
Make data yield maximally efficient through chain combinations of Cartesian products
If the results of Cartesian products in chain combination series of selected elements are exploited in intermediate steps and collected into new Cartesian products, enormous, maximally efficient data savings can be achieved in this way (e.g. with regard to the determination of possible combinations for phonemes and syllables for the purpose of creating complete phoneme and syllable databases).
For this purpose, the results (as a specific "data yield" or combination yield) of a (square) Cartesian product - always using the "backward reading option" (and the resulting mathematical procedure described above) - are made into a new base of elements for forming another (square) Cartesian product, and so on. In chain operations, this process is repeated until the desired data or combination yield (here: "target") is achieved. For the generation of all possible combinations of, for example, 4 solitary basic elements (e.g. phonemes or syllables), this means that the process has to be carried out twice in total: the base of elements for the second (square) Cartesian product results from the first (square) Cartesian product. For a larger base of elements as the starting point of the first (square) Cartesian product, the process must be repeated correspondingly more often. The effort involved in data collection (e.g. for manually writing down elements and combinations as the starting point for forming (square) Cartesian products) increases accordingly. However, at the beginning (i.e. when forming the first (square) Cartesian product) the effort required to generate data is still correspondingly low: at the beginning of such chain operations, the foundations for the greatest possible (most efficient) data savings can therefore be laid.
(More on this later.)
Further data savings (e.g. when determining combinations of phonemes or syllables) can essentially be achieved through the chosen notation:
For example, an appropriately trained analyst can assign numbers to the selected specific basic inventory of phonemes or syllables and adapt this spelling to the evaluation and reading of texts. However, this step must be thought through carefully beforehand, because once trained, there is a possibility that an analyst will be strongly conditioned to the reading option acquired in this way and therefore it will only work well for a maximum of one specific language. For example, the (to our current knowledge) a total of 20 solitaires of the Sumerian language can be transcribed into numbers. The reader would simply have to learn to equate the number sequences chosen for transcribing with readable text (which in many cases requires appropriate training and.
practice may be possible). The advantage of such an approach, at least for writing (the readability of such "texts" is - as already mentioned - a different issue altogether) is data efficiency (more on this later).
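The number transcription described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the four vowels are taken from the text, but the labels C01 to C16 are hypothetical stand-ins for the 16 consonants, and the numbering 1 to 20 in inventory order is an arbitrary choice.

```python
# Hypothetical inventory: the four vowels named in the text plus 16
# placeholder labels (C01..C16) standing in for the consonants.
VOWELS = ["a", "e", "i", "u"]
CONSONANTS = [f"C{n:02d}" for n in range(1, 17)]
INVENTORY = VOWELS + CONSONANTS          # 20 solitaires in total

# Assign the numbers 1..20 in inventory order (an arbitrary choice).
TO_NUMBER = {p: i for i, p in enumerate(INVENTORY, start=1)}
TO_PHONEME = {i: p for p, i in TO_NUMBER.items()}


def encode(phonemes):
    """Transcribe a sequence of phonemes into their numbers."""
    return [TO_NUMBER[p] for p in phonemes]


def decode(numbers):
    """Turn a sequence of numbers back into phonemes."""
    return [TO_PHONEME[n] for n in numbers]


codes = encode(["C03", "a", "C07", "u"])
print(codes)            # [7, 1, 11, 4]
print(decode(codes))    # ['C03', 'a', 'C07', 'u']
```

Because the mapping is one-to-one, a database built purely from numbers can later be transferred to any inventory of the same size by swapping the lookup table.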
The mathematical procedure used in the chain operations described above to generate (complete) combination databases using the principle of the (square) Cartesian product can also be explained very well using numbers to designate the elements (in theory, a designation with, e.g., binary numbers is of course also possible from the outset; however, due to the scope required, this connection will not be explained in more detail here).
Example with a basic stem of 4 solitary elements
In order to generate all possible (complete) combinations with a basic stem of 4 solitary elements using the principle of the (square) Cartesian product, it is necessary to form a (square) Cartesian product of specific elements twice in a row. In the first (square) Cartesian product, the elements are connected to each other in pairs. The result of this first Cartesian product is a data set that is reduced (with maximum efficiency) by using the "backward reading option" to a specific number of relevant combinations. This data set is combined into a new basic stem of elements (and sorted specifically for this purpose). The sorting follows previously defined preferences (and can therefore be influenced for specific reasons).
After the generated 2nd basic stem of elements has been raised to its own (square) Cartesian product, the possible (maximally efficient) combinatorial total yield is generated, because in this second operation the K [...]
In each operation, 2 previously combined elements (as independent, specific solitaires) are combined with each other in pairs (dual combination).
The sequence of the dual combination of specific solitary elements follows the following scheme:
[Step I]: a to b = a,b
[Step II]: a,b to a,b(1) = [a,b], [a,b(1)]
(see appendix 9, follows asap)
For multiple operations, which result when a correspondingly larger number of elements is selected as the base stem, this method (continuous dual combination of the resulting elements, which are each time combined into new, independent base stems) is repeated correspondingly often.
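The two chain steps for a base stem of 4 solitary elements can be sketched in Python as follows. The sketch assumes that the "backward reading option" means keeping only one of each mirrored pair while keeping the pairs of an element with itself; under that assumption the standard-library function `itertools.combinations_with_replacement` produces exactly such a reduced square. The function name `dual_combine` and the labels a to d are illustrative only.

```python
from itertools import combinations_with_replacement


def dual_combine(base):
    """One chain step: pair every element of `base` with every element of
    `base`, keeping only one of each mirrored pair ("backward reading")."""
    return list(combinations_with_replacement(base, 2))


base_1 = ["a", "b", "c", "d"]      # 4 solitary elements (placeholders)
base_2 = dual_combine(base_1)      # Step I: pairs of solitaires
base_3 = dual_combine(base_2)      # Step II: pairs of pairs

print(len(base_2))   # 10
print(len(base_3))   # 55
print(base_3[1])     # (('a', 'a'), ('a', 'b'))
```

Each result list is used unchanged as the new base stem of the next step, which is the "carry" idea of the chain operation.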
The main effort lies in summarizing and sorting the resulting combinations of elements (here when carrying out these operational steps by hand, which can also entail corresponding health risks; please be sure to note the disclaimer attached to this article!). If modern office software with spreadsheet functions is used, the step of manually writing down the output combinations, and thus using them as new, optimally usable solitary elements for creating basic databases, cannot yet be avoided (to my current, but possibly expandable, knowledge). However, recording such elements manually has the advantage that a large part of the data effort can be reduced in advance, for example if, with appropriate expertise, the syntax of the selected language can be applied to the principles described above: assuming an expert can state (with absolute certainty) that certain combinations of, for example, phonemes or syllables are superfluous because they would never be used under the (known) syntax of the selected language, then the generated base stems of combinations (of e.g. phonemes or syllables) can be thinned out in advance. The syntax of the chosen language (if reliably known) is thus one of the main arguments for saving data when the described method is used to create databases for e.g. phonemes and syllables. (With appropriate experience, the copy and paste functions of spreadsheet software can also significantly reduce the overall burden of this procedure.)
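How a base stem could be thinned out in advance can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the four solitaires are made up, and the two "forbidden" pairs stand in for whatever an expert might reliably know about a real language.

```python
from itertools import combinations_with_replacement

base = ["BA", "KO", "JO", "PAE"]   # made-up solitaires for illustration

# Hypothetical expert knowledge: pairs assumed never to occur in the
# language, in either reading direction. Purely illustrative.
FORBIDDEN = {frozenset(["BA", "JO"]), frozenset(["KO"])}   # KO,KO excluded


def allowed(pair):
    """True unless the (unordered) pair is listed as forbidden."""
    return frozenset(pair) not in FORBIDDEN


all_pairs = list(combinations_with_replacement(base, 2))
thinned = [p for p in all_pairs if allowed(p)]

print(len(all_pairs), len(thinned))   # 10 8
```

Thinning early pays off because the next chain step is built from the thinned base: with 8 instead of 10 elements it yields SUM(8) = 36 instead of SUM(10) = 55 combinations.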
DISCLAIMER (Haftungsausschluss):
The reason I'm posting this topic in this section is because a connection to archaeomathematics automatically arises. Due to the technical possibilities, it is not easy to clearly present the necessary connections in a simple manner. I'm trying to make the best of it.
Please note that any use of the content I have written on this topic, despite careful checking, is entirely at your own risk and free of any liability on the part of me as the author. Especially for the aspects of language research and mathematics, it should be emphasized that my contributions to this topic are not peer-reviewed and that I pursue these aspects of research purely as a hobby: I am neither a qualified mathematician nor a language researcher and have worked out all the connections myself based on logical conclusions, systematic experimentation and a little reading on the subject. I would particularly like to point out that my own way of presenting connections (e.g. in formulas and logical argument chains) may differ from customary international conventions. One of the most important differences in this context is that I write factorials here in the forum with the exclamation mark in front of the number, and not after it, as is usual and internationally agreed.
Please note in particular that the connections I describe in this topic are not necessarily and automatically correct (nor do they have to be) and, despite the most careful checking, can contain errors (e.g. also formatting errors and auto-correction errors as well as grammatical errors). Overall, the connections I describe here in the topic are in no way suitable as a replacement for qualified teaching or as a certified or certifiable learning aid. The texts I have written on this topic have not been proofread by others.
However, the basic principle can theoretically be transferred to e.g. phoneme inventories with any number of elements, and its limits lie exclusively in technical or manual feasibility. Because I think it makes sense to understand and convey the basics, I do not deal in this topic as a whole with the possibilities of modern IT (e.g. special software and its possible programming): in this topic I am simply showing how relatively far-reaching results can be generated using the simplest means (manually and supported by office software in the form of spreadsheets).
I must expressly warn anyone who would like to deal with this topic in more detail at this point: creating complex spreadsheet documents, e.g. in handwritten form or in the form of computer documents, can lead to extreme and even health-damaging physical stress. Such stress can include, for example, physical and psychological fatigue, poor ergonomic posture and strain on the tendon sheaths (see e.g. "mouse arm"). Please therefore pay attention to the usual warnings and note that I, as the author of this topic, assume no liability whatsoever for any possible consequences of a recipient's engagement with the issues described in this topic.
Any errors that may still exist in the author's contributions can be corrected by the author at any time and without prior notice and without special highlighting/marking. The author's contributions to this topic are continuously checked for errors and, if possible, corrected promptly by the author.
The author's contributions and discussions in this topic may (unintentionally) contain serious errors in logic, conclusions, calculations and formulations as well as, for example, grammatical and spelling errors, among other things. In particular, the use of the content and context published by the author in this topic with regard to copyright issues and, despite the most careful check, any existing issues in this regard is entirely at the user's own risk and free from any liability on the part of the author.
In the event of any use of the content and context published by the author in this topic and any resulting consequences, including legal ones, it is under no circumstances the responsibility of the author to give or express any guarantees in advance that the contents published by the author in this topic are, despite careful examination, not subject to any copyright protection rights (also depending on the jurisdiction of a respective country and with general reference to international, stellar, galactic, interstellar and intergalactic law). This generally also applies to any future law.
In particular, the author assumes no liability for lost sales and profits with the content and context published in this topic.
Chain-operating data reduction in (square) Cartesian products
If numbers of combinations that optimize data efficiency are generated in chain operations using (square) Cartesian products in square matrices, the following scenario arises with regard to the maximally efficient data yield:
The tabular overview below shows that the data saving of a single, non-chained application of the Cartesian product principle is always a triangular number: as the basic value (here x) increases, the savings (x² - x)/2 run through the series of triangular numbers 0, 1, 3, 6, 10, ...
[x] / [x²] / [data savings (x² - x)/2] / [data savings in %] / [Line]
[1] / [1] / [0] / [0.000 %] / [Line 1]
[2] / [4] / [1] / [25.000 %] / [Line 2]
[3] / [9] / [3] / [33.333p %] / [Line 3]
[4] / [16] / [6] / [37.500 %] / [Line 4]
[5] / [25] / [10] / [40.000 %] / [Line 5]
[6] / [36] / [15] / [41.666p %] / [Line 6]
[7] / [49] / [21] / [42.857... %] / [Line 7]
[8] / [64] / [28] / [43.750 %] / [Line 8]
[9] / [81] / [36] / [44.444p %] / [Line 9]
[10] / [100] / [45] / [45.000 %] / [Line 10]
[11] / [121] / [55] / [45.455... %] / [Line 11]
[12] / [144] / [66] / [45.833p %] / [Line 12]
[13] / [169] / [78] / [46.153... %] / [Line 13]
[14] / [196] / [91] / [46.428... %] / [Line 14]
[15] / [225] / [105] / [46.666p %] / [Line 15]
[16] / [256] / [120] / [46.875 %] / [Line 16]
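The rows of this overview can be recomputed with a short Python sketch. It only applies the formula from the table header; the function name `savings_row` and the output layout are illustrative choices.

```python
def savings_row(x):
    """Return (x, x squared, saved combinations, saving in percent)."""
    total = x * x
    saved = (total - x) // 2          # mirrored cells that can be dropped
    return x, total, saved, 100 * saved / total


for x in range(1, 17):
    n, total, saved, pct = savings_row(x)
    print(f"[{n}] / [{total}] / [{saved}] / [{pct:.3f} %]")
```

Since (x² - x)/(2x²) = (x - 1)/(2x), the percentage saved by a single step approaches 50 % as x grows but never reaches it.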
How much data can be saved in connected, successive chain operations when applying the principle of the Cartesian product depends on how many placeholders (e.g. 20 phonemes) are used. To explain the principle, an example with 8 placeholders (e.g. 8 phonemes) is used here: assuming, fictitiously, that there is a language with only 8 phonemes in its basic inventory, then 3 consecutive chained combination operations according to the principle of the Cartesian product can generate all possible phoneme combinations of such a fictitious language, with the following restriction: none of the phonemes may appear in the output line more than (...follows).
Nonlinear chained operations using the Cartesian product principle (8 placeholders):
Requirements:
Basic phoneme inventory to combine (8 phonemes in total, purely fictional):
Phoneme 1 = BA
Phoneme 2 = KO
Phoneme 3 = JO
Phoneme 4 = PAE
Phoneme 5 = TOK
Phoneme 6 = PAR
Phoneme 7 = POI
Phoneme 8 = PON
Processes to be carried out / STEPS:
For reasons of space and scope, only the results can be listed here, in the specific order chosen; the tabular overviews used to generate the Cartesian products are not shown. The data is first output as numerical codes and then transferred to phonetic databases, in which each number is assigned the corresponding phoneme.
In the first combining step, each phoneme is combined with every other phoneme to form the resulting possible dual combinations (pairs):
STEP I:
Combining all placeholders into pairs in all possible combinations. Mirrored pairs no longer need to be displayed (use of the "read backwards" option). A slash separates the combinations; two slashes with a minus sign in between (/-/) mark the end of a specific sequence (row).
Results (36 combinations) as numbers:
1,1/2,1/3,1/4,1/5,1/6,1/7,1/8,1/-/
2,2/3,2/4,2/5,2/6,2/7,2/8,2/-/
3,3/4,3/5,3/6,3/7,3/8,3/-/
4,4/5,4/6,4/7,4/8,4/-/
5,5/6,5/7,5/8,5/-/
6,6/7,6/8,6/-/
7,7/8,7/-/
8,8
Results (36 combinations) as phonemes (transliterated):
BA,BA/KO,BA/JO,BA/PAE,BA/TOK,BA/PAR,BA/POI,BA/PON,BA/-/
KO,KO/JO,KO/PAE,KO/TOK,KO/PAR,KO/POI,KO/PON,KO/-/
JO,JO/PAE,JO/TOK,JO/PAR,JO/POI,JO/PON,JO/-/
PAE,PAE/TOK,PAE/PAR,PAE/POI,PAE/PON,PAE/-/
TOK,TOK/PAR,TOK/POI,TOK/PON,TOK/-/
PAR,PAR/POI,PAR/PON,PAR/-/
POI,POI/PON,POI/-/
PON,PON
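The 36 pairs of Step I and their transliteration can be reproduced with a short Python sketch. It assumes the row scheme visible in the lists above (row j starts with the pair j,j and continues with all larger first numbers); the "/-/" row terminators of the lists are left out, and the function name `step_one` is illustrative.

```python
PHONEMES = {1: "BA", 2: "KO", 3: "JO", 4: "PAE",
            5: "TOK", 6: "PAR", 7: "POI", 8: "PON"}


def step_one(n):
    """Rows of pairs (i, j) with i >= j: row j starts at (j, j)."""
    return [[(i, j) for i in range(j, n + 1)] for j in range(1, n + 1)]


rows = step_one(8)

# Numerical codes, one row per line, in the notation used above.
for row in rows:
    print("/".join(f"{i},{j}" for i, j in row))

# The same rows transliterated into the fictional phonemes.
for row in rows:
    print("/".join(f"{PHONEMES[i]},{PHONEMES[j]}" for i, j in row))

print(sum(len(row) for row in rows))   # 36
```

Keeping the numerical rows and the lookup table separate means the same 36 codes can be reused for any other inventory of 8 elements.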
(more content will follow soon)
STEP II:
From Step II onwards, the advantages of modern electronic data processing and spreadsheet software begin to take effect: using copy & paste and the functions of spreadsheet software, the work steps for determining all the resulting combinations, which would be extremely time-consuming to write out by hand, can be carried out in a very short time with an appropriate concept (see appendix 10). In less than 1 hour, perhaps even in about half an hour, it is possible to determine all relevant combinations of 36X36 and set them up as a spreadsheet.
Determination of the number of combinations by calculating the specific sum number (as a triangular number):
A method to determine the number of possible combinations when using the "read backwards" option in generating a Cartesian product (see data savings) that is simple in practice, although its mathematical background can be complex depending on the approach, is the following: the sum of the directly consecutive natural numbers from 1 up to the number of elements is determined, following the principle of the series of triangular numbers. For 36X36 combinations this gives SUM(36) = {1+2+3+4+5+ ... +32+33+34+35+36} = 666. With modern spreadsheet software, determining such totals is very easy and can be done in just a few seconds.
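The sum can be checked in Python both by adding up term by term and with the closed form n(n+1)/2 for triangular numbers (Gauss's formula). The function name `triangular` is an illustrative choice.

```python
def triangular(n):
    """Sum 1 + 2 + ... + n in closed form (Gauss's formula)."""
    return n * (n + 1) // 2


explicit = sum(range(1, 37))      # 1 + 2 + ... + 36, added up term by term
print(explicit, triangular(36))   # 666 666
```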
STEP III:
Step III is not presented here due to the scope and effort required. Elevating 666 elements to the 666X666 Cartesian product would yield a total of 443556 combinations, which when reduced by applying the Read Backward option would yield 222111 combinations.
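The sizes of the base stems across all three steps can be followed with a small Python sketch, assuming that every step keeps x(x+1)/2 of the x times x cells when the "read backwards" option is applied. The function name `chain_sizes` is illustrative.

```python
def chain_sizes(start, steps):
    """Base size before each step, followed by the final yield, when every
    step keeps SUM(x) = x(x+1)/2 of the x*x cells."""
    sizes = [start]
    for _ in range(steps):
        x = sizes[-1]
        sizes.append(x * (x + 1) // 2)
    return sizes


print(chain_sizes(8, 3))    # [8, 36, 666, 222111]
print(666 * 666)            # 443556
```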
Creating such a database manually with conventional spreadsheet software can be accomplished in a relatively short period of time, perhaps even in a single day.
However, I would like to avoid this work at this point, as the main focus of this topic article is on the Sumerian language with a phoneme inventory of 20 phonemes, which requires a different approach when creating corresponding databases.
Summary:
This work may seem very laborious. However, if the individual work steps are optimized and, if necessary, transferred to many different people, it is realistically possible to create basic databases with, for example, numbers as placeholders that are suitable for transfer to databases with, for example, phonemes or syllables. Once this basic work is done, the basic databases do not need to be created again for any further action. Added to this are the modern possibilities of programmed, automated creation of such databases using software that is to be developed (or possibly already developed) specifically for this purpose.
Another advantage of the method described above is that the algorithms by which the combination sequences arise in combined Cartesian products can be researched better, in order to find simplifications for the development of combination series.
I will also spare myself here from spelling out the combination rows created in Step II, whose placeholders are filled with numbers, using the fictional phonemes from the example above for illustration. I think the basic principle of the possible procedure has now been sufficiently discussed and explained.
Possible procedure for creating a phoneme database with 20 placeholders as a basic inventory:
The overall procedure for creating a phoneme database with 20 placeholders as a basic inventory requires some preliminary considerations regarding the total effort involved: here the application of specific mathematics helps us in advance to avoid the worst errors and thus saves unnecessary work and time.
If we plan to create a complete, gapless phoneme database with 20 placeholders as a basic inventory, we must be aware in advance that the desired database is extremely large and requires a corresponding amount of effort to create: every logical error at the beginning will eventually take its toll over the course of the project and can destroy hours, entire days, weeks and months of work. The basic mistakes that tend to creep in at the beginning of such an undertaking are best avoided by thinking through such a project in detail beforehand.
The main mathematical aspects that should definitely be considered before starting such a mammoth project are the following:
Determining the number of placeholders
The amount of data required later in the project depends on the number of placeholders determined in advance. The data effort of such a project grows with the number of predefined placeholders, and it does so quadratically in every step, since each step forms a square Cartesian product.
When determining in advance the number of placeholders and the meaning to be assigned to them (e.g. phonemes, syllables), it is important to consider whether we can really be sure how many phonemes or syllables a specific language that has been researched to date actually has. It is also essential that we are aware of the different effects of the method with which we create such a database. In the context of my previous statements on this topic, a distinction must be made between two basic methods: the method using the Cartesian product principle combined with the "read backwards" option, and the method I call the "telephone book method". I will discuss the "telephone book method" separately in a follow-up post on this topic.
You must be aware of the following project scope when 20 placeholders are used to create databases by concatenating Cartesian products, with the intermediate results carried over in each case, using the "read backwards" option. A crucial question is also how many placeholders, lined up one after the other as readable text in a specific language (here, for example, Sumerian), we ultimately want to generate. Clarifying these questions in advance has a direct influence on which steps we decide on in the planned project:
A basic inventory of 20 placeholders yields 20X20 = 400 combinations (without gaps) when 2 placeholders are combined (basic combinations). With the "read backwards" option, SUM(20) = 210 combinations are generated, which is a data saving of 190 combinations in the 1st process step. The following tabular overview shows how this series of data effort develops if we actually aim to combine 20 placeholders with each other without any gaps, ultimately forming 20 elements (placeholders) of readable text arranged in a row. The resulting superset (master set) of readable text that is output depends mathematically directly on the respective process step and the selected generation method:
Chain algorithmic generation of combinations with 20 placeholders basic inventory:
(Selected method: data-efficient concatenation of Cartesian products with the "read backwards" option.)
x = number of placeholders (basic inventory)
AXB = generated Cartesian product
SUM(x) = [(x² + x) : 2] = number of combinations kept with the "read backwards" option; the data saving per step is [(x² - x) : 2]
carry(x) = carry for x into the next step; carry(x) = SUM(x)
[x] / [AXB] / [x²] / [SUM(x)] / [carry(x)] / [line]
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
[20] / [20X20] / [400] / [SUM(20)] / [carry(210)] / [line1]
[210] / [210X210] / [44100] / [SUM(210)] / [carry(22155)] / [line2]
[22155] / [22155X22155] / [490844025] / [SUM(22155)] / [carry(245433090)] / [line3]
...
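The lines of this overview can be continued with a short Python sketch. It assumes that the carry is the triangular number SUM(x) = x(x+1)/2, which is the rule that turns 20 into 210 and 210 into 22155 in the first two lines; the function name `chain_table` and the choice of four lines are illustrative.

```python
def chain_table(x, lines):
    """Yield (x, x squared, SUM(x)) per line; SUM(x) is carried forward."""
    for _ in range(lines):
        carry = x * (x + 1) // 2
        yield x, x * x, carry
        x = carry


table = list(chain_table(20, 4))
for number, (x, square, carry) in enumerate(table, start=1):
    print(f"[{x}] / [{x}X{x}] / [{square}] / [carry({carry})] / [line{number}]")
```

Under this rule the carry of the fourth line already exceeds 10^16 combinations, which shows how quickly a manual approach reaches its limits.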
The small list quickly makes it clear what an enormous amount of data we would be confronted with in such a project. Therefore, the main focus of such a project should be on every possible measure that can reduce further data expenditure. One of the most essential ways to achieve this is fundamental knowledge of the syntax of the specific language. Researching the syntax of the language in detail is indispensable, among other things, for the reasons mentioned here: possible data savings in decryption attempts. More will follow on the strategies by which knowledge of the syntax of a specific language can help us to draw conclusions for possible decryption attempts or complete mapping attempts, using the methods already presented here and those still to be discussed.
DISCLAIMER:
Any use of the content posted by the author in this topic is entirely at your own risk and free of any liability on the part of the author.
SOURCES / BIBLIOGRAPHY:
[books]:
du Sautoy, M.: Die Musik der Primzahlen - Auf den Spuren des größten Rätsels der Mathematik. 7. Aufl. ungek. Ausg., Verlag dtv Wissen, München, 2013
Reiss / Schmieder: Basiswissen Zahlentheorie - eine Einführung (Reihe: Mathematik für das Lehramt) - Eine Einführung in Zahlen und Zahlbereiche. 2. Aufl. Verlag Springer, Berlin, Heidelberg, New York, 2007.
Schmidt / Trenkler: Moderne Matrix-Algebra. Mit Anwendungen in der Statistik. Verlag Springer, Berlin, Heidelberg, New York, 1998.
[gWiki; German-language Wikipedia]:
[gWiki1]:
Bibliographic details for "Sumerische Sprache"
Page title: Sumerische Sprache
Publisher: Wikipedia – Die freie Enzyklopädie.
Author(s): Wikipedia authors, see version history
Date of last edit: 9 September 2023, 21:00 UTC
Page version ID: 237177728
Permanent link: https://de.wikipedia.org/w/index.php?ti ... =237177728
Date retrieved: 1 January 2024, 11:03 UTC
[gWiki2]:
Bibliographic details for "Georg Cantor"
Page title: Georg Cantor
Publisher: Wikipedia – Die freie Enzyklopädie.
Author(s): Wikipedia authors, see version history
Date of last edit: 14 December 2023, 12:47 UTC
Page version ID: 240179661
Permanent link: https://de.wikipedia.org/w/index.php?ti ... =240179661
Date retrieved: 7 January 2024, 11:43 UTC
[gWiki3]:
Bibliographic details for "Carl Friedrich Gauß"
Page title: Carl Friedrich Gauß
Publisher: Wikipedia – Die freie Enzyklopädie.
Author(s): Wikipedia authors, see version history
Date of last edit: 23 December 2023, 07:46 UTC
Page version ID: 240453755
Permanent link: https://de.wikipedia.org/w/index.php?ti ... =240453755
Date retrieved: 7 January 2024, 15:10 UTC
Combinatorial basics for creating databases for e.g. phonemes or syllables of a language
From a combinatorial perspective, the principle described here (one possible, special principle) of creating a database of a language, e.g. for phonemes or syllables (explained here using the example of the Sumerian language), can be explained as follows. The procedure is split into several successive basic steps. These steps can all be done either by hand or in the form of computer spreadsheet documents (please be sure to note the warnings in the disclaimer!):
Step 1:
In order to create the database, the language to be analyzed and its phoneme inventory (ideally a completely known one*) must first be determined. This definition also determines the number of elements to be mathematically (combinatorially) combined with one another as well as the resulting "chain combinations".
Due to the scope required, I will only address the special case of (to our current knowledge) "completely known" phoneme inventories in passing. For the sake of simplicity, at this point in the discussion I am assuming that the phoneme inventory of the Sumerian language known to us today has, to our current knowledge, been fully and completely deciphered. (However, as an amateur and non-linguist, I could be wrong about this; my main focus here is on mathematical questions about the structure and deciphering of languages. In order to be able to explain the relevant procedures well, I need a data basis that is as clearly established as possible.)
When I continue to deal with the Sumerian language and its phoneme inventory, I will assume for the sake of simplicity (in order to be able to explain the entire procedure described here well) that the Sumerian language actually has a total of 20 phonemes (4 vowels and 16 associated consonants).
Step 2:
The specified number of elements is combined in chain combinations until the desired number of placeholders lined up one after the other (more on this later) is reached. As an introduction to the topic, I will initially use a smaller number of elements in the following to explain the respective principle for combining elements with each other (e.g. phonemes or also syllables); there are different methods for this. In order to understand how the elements are combined with each other and what mathematical results this produces, it is necessary to deal with the mathematical concept of the Cartesian product (more on this later) and later, also in direct comparison with the principle of the Cartesian product, with a principle that I call the "telephone book principle".
Step 3:
Once all elements of a project (e.g. phonemes or syllables) have been successfully combined with one another (if this is technically possible, e.g. manually), the results can be combined into a specific database (more on the benefits and use of such a database later).
About the Cartesian product
The term Cartesian product comes from (so-called "naive") set theory (according to Cantor) [Reiss/Schmieder, 2007]; [gwiki2]. The Cartesian product describes the "crossed" (simple) combination of the elements of two sets. It can be clearly explained using two equal sets (i.e. two sets that contain exactly the same elements). If we look at the sets A, B with A = {1, 2, 3} and B = {1, 2, 3}, we get a good basis for explaining the Cartesian product. At this point it is necessary to explain that a mathematical set in the sense of set theory contains a specific number of elements without any inherent order. A tuple, by contrast, is commonly spoken of when we consider an ordered number of elements: a set by itself carries no order (although an order can additionally be defined on it), whereas a tuple always contains its elements in a fixed order. In computer science, for example, the ordered consideration of elements (tuples) is often used [Reiss/Schmieder, 2007].
For the combinatorial creation of phoneme databases for languages, for example, it makes sense to organize the phonemes of the phoneme inventory (here using the example of the Sumerian language with 4 vowels and 16 associated consonants), but whether and how we do so is generally not important: we can therefore use sets or tuples, just as we prefer. When we define our own order for a basic stem of elements, this order corresponds in the mathematical sense, in my opinion, not to a tuple (depending on the definition), but in any case to a specifically ordered set. However, because of these mathematical pitfalls in defining an "ordered number of elements", it is easier for me to choose ordering systems that are as obvious as possible (e.g. the order of numbers or letters in specific systems), so that I can call such ordered sets "tuples".
This step fulfills a purpose, which I will discuss later, but it does not change the overall result of "chain combinatorics": in this topic I only explain the creation of complete element databases, i.e. all elements are combined with each other until all possible combinations have been exhausted. The ordered consideration of the elements to be combined therefore serves mainly to simplify the mathematical consideration and the subsequent possible assignment of elements in the form of an index (more on this later).
The purpose of such an index, as far as I know so far (which I must consciously formulate as a conjecture here), is, among other things, that "linguistic peculiarities", e.g. those of a specific author of historical documents, can be better assigned.
Raising a defined number of elements to a Cartesian product means, so to speak, "crossing" every single element with every other element (e.g.) of a set (here this term is meant exclusively in the strictly mathematical sense).
We can therefore write in simplified terms, e.g.:
Cartesian product of A and B = AXB, or of A and A(1), or, in the notation I use, of A(0) and A(1) = A(0)XA(1)
Applied to a simple ("two-dimensional") tabular structure, this means that every element of the set/tuple A(0) is combined with every element of the set/tuple A(1). For "language research" in the sense of creating phonetic databases, for example, it is important to aim for A(0) = A(1) (exactly the same elements should be contained in set/tuple A(0) as in set/tuple A(1)). This condition does not necessarily have to be met in the sense of the topic described here, but it makes combinatorial work much easier. The result of applying this principle is a very good, clear "data economy". Experience has shown that this approach actually saves time and concentration when working, e.g.:
A(0) = {1, 2, 3} and A(1) = {1, 2, 3}
You can also write: "The sets A and B are equally powerful, i.e. they have the same cardinality (because each of the two sets contains exactly the same elements)". Furthermore, when considering tuples it could be written: "Tuple A and tuple B have the same length and contain exactly the same elements in exactly the same number.", e.g.:
(x1 ... x3)(0) = {1, 2, 3}
(x1 ... x3)(1) = {1, 2, 3}
This connection can be presented in a table, for example, as follows (here in notation with tuples) (see appendix 1).
(Note: The separation of elements with commas does not represent comma numbers in the sense of German spelling, but rather a list of elements that belong to one another.)
The elements ordered (here according to a certain scheme) arise from A(0)XA(1) as:
A(0)XA(1) = {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)}
which therefore represent the Cartesian product of A(0) and A(1).
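The enumeration above can be reproduced mechanically. The following is a minimal Python sketch, assuming the standard-library function itertools.product; the variable names a0, a1 and pairs are chosen here for illustration and are not part of the notation used in this topic.

```python
from itertools import product

# A(0) and A(1) contain exactly the same elements.
a0 = [1, 2, 3]
a1 = [1, 2, 3]

# Cartesian product: every element of A(0) paired with every element of A(1).
pairs = list(product(a0, a1))

print(pairs)
print(len(pairs))  # 3 * 3 = 9 combinations
```

The pairs come out in the same row-by-row order as in the listing above.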
Upon closer analysis of the Cartesian product of A(0)XA(1), two particularly noteworthy aspects emerge:
The table can be mathematically referred to as a square matrix. Because the number space of natural numbers can be described with the properties of "multidimensionality", I also refer to such a matrix as "two-dimensional".
In the matrix, two areas that mirror each other across the main diagonal can be identified, as well as the elements located on the main diagonal. If the matrix is written out as a table of cell contents, a structure results that can be divided into 3 main areas. I call these areas alpha (α), beta (β) and gamma (γ) (see appendix 2). This classification and the names chosen here should in no way be viewed as fixed: mathematical expressions are always open to discussion, and this naming is only one of many options that serves to better describe the matrix contents. In the following it is now possible to formally describe the matrix and its contents:
R(mxn) is a square matrix if m = n (more on this later).
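The division into three areas can be sketched in Python as follows. Which triangle is called alpha and which beta is an assumption here (the appendix is not reproduced in this text): alpha is taken to be the pairs whose first index is larger than the second, beta their mirror images, and gamma the main diagonal. The function name split_regions is hypothetical.

```python
from itertools import product

def split_regions(elements):
    """Split the square Cartesian product into three regions.

    Assumed naming (inferred, not confirmed by the appendix):
    alpha = pairs below the main diagonal (first index > second index),
    beta  = their mirror images (first index < second index),
    gamma = pairs on the main diagonal (both indices equal).
    """
    indexed = list(enumerate(elements))
    alpha, beta, gamma = [], [], []
    for (i, x), (j, y) in product(indexed, repeat=2):
        if i > j:
            alpha.append((x, y))
        elif i < j:
            beta.append((x, y))
        else:
            gamma.append((x, y))
    return alpha, beta, gamma

alpha, beta, gamma = split_regions([1, 2, 3])
print(alpha)  # [(2, 1), (3, 1), (3, 2)]
print(beta)   # [(1, 2), (1, 3), (2, 3)]
print(gamma)  # [(1, 1), (2, 2), (3, 3)]
```

Every pair in beta is the reversed form of exactly one pair in alpha, while the pairs in gamma read the same in both directions.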
Applied to the possible combination of phonemes, for example, the principle of the Cartesian product with its specific results should not be confused with the following combinatorial notation, in which all elements of a set are crossed with all elements of the same set and represented in all possible notations ("dictionary notation" or "telephone book notation"), at:
{(a, b, c) ∈ A, (a, b, c) ∈ B}
A = {a, b, c}
B = {a, b, c}
linear combination options for A, B:
1, 1, 1 or a, a, a [line I]
1, 1, 2 or a, a, b [line II]
1, 1, 3 or a, a, c [line III]
1, 2, 1 or a, b, a [line IV]
1, 2, 2 or a, b, b [line V]
1, 2, 3 or a, b, c [line VI]
1, 3, 1 or a, c, a [line VII]
1, 3, 2 or a, c, b [line VIII]
1, 3, 3 or a, c, c [line IX]
2, 1, 1 or b, a, a [line X]
2, 1, 2 or b, a, b [line XI]
2, 1, 3 or b, a, c [line XII]
2, 2, 1 or b, b, a [line XIII]
2, 2, 2 or b, b, b [line XIV]
2, 2, 3 or b, b, c [line XV]
2, 3, 1 or b, c, a [line XVI]
2, 3, 2 or b, c, b [line XVII]
2, 3, 3 or b, c, c [line XVIII]
3, 1, 1 or c, a, a [line XIX]
3, 1, 2 or c, a, b [line XX]
3, 1, 3 or c, a, c [line XXI]
3, 2, 1 or c, b, a [line XXII]
3, 2, 2 or c, b, b [line XXIII]
3, 2, 3 or c, b, c [line XXIV]
3, 3, 1 or c, c, a [line XXV]
3, 3, 2 or c, c, b [line XXVI]
3, 3, 3 or c, c, c [line XXVII]
This notation is also of great importance for combinatorics in the sense of "linguistic research" and e.g. for deciphering historical texts with their mathematically resulting possibilities - and to say it in advance: it is the fully complete, but extremely laborious method. Even with the small example of 3 elements it produces 3*3*3 = 3^3 = 27 possibilities, whereas raising 3 elements to the Cartesian product (here "first stage", or 1st evolution) yields only 3*3 = 9 combinations; with a basic inventory of 4 elements, the Cartesian product yields 4*4 = 16 combinations.
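The 27 lines of the "dictionary notation" can be generated with a short Python sketch, assuming itertools.product with its repeat argument; the variable names elements and lines are illustrative only.

```python
from itertools import product

elements = ["a", "b", "c"]

# Every position can hold every element: 3 * 3 * 3 = 27 lines.
lines = list(product(elements, repeat=len(elements)))

for number, line in enumerate(lines, start=1):
    print(number, ", ".join(line))
```

The output follows the same order as the list above, from a, a, a to c, c, c.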
The differences and effects to generate combinations between these two essential methods will be discussed later.
In short, as you may have already noticed: a Cartesian product of two equally sized sets of elements always produces a square number of possible combinations (x^2), while the so-called “dictionary notation” produces a number of possible combinations that grows according to the principle of powers (x^x), e.g.:
Table of resulting combinations depending on the used principle:
comb = "combinations"
CP = Cartesian Product (here always "squared" Cartesian product)
TBP = Telephone Book Principle
x = number of elements contained in the base of elements to be combined
[x] / [comb. CP] / [comb. TBP] // [line]
[1] / [1^2 = 1] / [1^1 = 1] // [line]
[2] / [2^2 = 4] / [2^2 = 4] // [line]
[3] / [3^2 = 9] / [3^3 = 27] // [line]
[4] / [4^2 = 16] / [4^4 = 256] // [line]
[5] / [5^2 = 25] / [5^5 = 3125] // [line]
[6] / [6^2 = 36] / [6^6 = 46656] // [line]
[7] / [7^2 = 49] / [7^7 = 823543] // [line]
etc.etc.
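The two growth rates in the table can be recomputed with a few lines of Python. This is only a sketch; the function name combination_counts is hypothetical.

```python
def combination_counts(x):
    """Return (x**2, x**x): square Cartesian product vs. telephone book principle."""
    return x ** 2, x ** x

for x in range(1, 8):
    cp, tbp = combination_counts(x)
    print(f"[{x}] / [{cp}] / [{tbp}]")
```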
As is clear from the steep increase in results when using the (maximally complete, i.e. gap-free) method of the telephone book principle, one of the main objectives when building databases that are as complete as possible, for example for phonemes or syllables or for the encoding or deciphering of texts, is to analyze the number of possible combinations as far as possible (and as far in advance of a project as possible). The application of the principle of the Cartesian product, among other essential methods, is suitable for exactly this purpose, provided it is used at a sensible time and at the appropriate point in a project.
The characteristic that a Cartesian product in a square matrix, i.e. with A = B, always generates a square number of possible combinations (with a single exception, which will be discussed later) is a big advantage in terms of data economy when creating complete tabular data lists (here databases with e.g. phonemes or syllables), as I will explain below.
First of all: both methods offer their own specific advantages and possible applications, which depend on what is to be achieved. Both methods can be combined with each other if this makes sense. However, I will discuss the different methods (and the possible combination of the two different methods in specific areas) one after the other and initially concentrate on the method of determining combinations using the Cartesian product principle.
Determine chain combinations by creating and combining Cartesian products
The big and special advantage for determining combinations (chain combinations) using the principle of the Cartesian product is that Cartesian products when applied to square matrices always output a square number of possible combinations, specifically sorted (please note the only exception here: I will explain further below). This connection can be used up to a certain resource-related limit when manually determining (in this case in tabular form) phoneme or syllable combinations (as well as letter combinations, for example). “Resources” here mean factors such as writing and computing time and the time that you are able to devote to concentrated work (which can of course vary greatly from person to person).
In order to better explain what this advantage is, a short excursion into the world of so-called figurate numbers and thus into the world of triangular numbers and square numbers as well as the Gaussian formula for triangular numbers is necessary.
As the young Gauss (1777 - 1855) [gWiki8], one of the most famous mathematicians in the world today and probably of all time, recognized at a young age, square numbers in the number space of natural numbers can always be formed according to the same scheme by combining two directly successive triangular numbers. Even if it is strongly assumed that the ancient Greeks, for example, were aware of this connection, Gauss is considered to be the first who verifiably published about this and summarized the principle in a formula [duSautoy,39].
The principle of forming square numbers can be explained relatively simply using two series of triangular numbers placed next to each other. The formation of the triangular numbers in turn follows its own characteristic recursivity, and the recursivity of both number series (triangular numbers and square numbers) can be conclusively related to each other in a tabular overview (this represents one of a few possible structural proofs; based on my current knowledge, I cannot judge at this point to what extent they are generally known).
Square numbers and their formation:
at:
ℕ = {0, 1, 2, 3, 4, 5, 6, 7, ...}
x² = square numbers
x² ∈ ℕ = {0, 1, 4, 9, 16, 25, 36, 49 ...}
▲ = triangular numbers
▲ ∈ ℕ = {0, 1, 3, 6, 10, 15, 21, 28, 36 ...}
Evolution of triangular numbers in ℕ:
0+1 = 1
1+2 = 3
1+2+3 = 6
1+2+3+4 = 10
1+2+3+4+5 = 15
1+2+3+4+5+6 = 21
1+2+3+4+5+6+7 = 28
1+2+3+4+5+6+7+8 = 36
alternatively you can write e.g:
▲(1) = ∑(0...1) = 1
▲(2) = ∑(0...2) = 3
▲(3) = ∑(0...3) = 6
▲(4) = ∑(0...4) = 10
etc.etc.
Another way I use for triangular numbers is:
▲∑(1) = 1
▲∑(2) = 3
▲∑(3) = 6
▲∑(4) = 10
etc.etc.
Evolution of square numbers in ℕ:
0 + 1 = 1
1 + 3 = 4
3 + 6 = 9
6 + 10 = 16
10 + 15 = 25
15 + 21 = 36
etc.etc.
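Both series can be checked with a short Python sketch. It assumes the closed form n(n+1)/2 for the n-th triangular number (the Gaussian formula mentioned above); the function name triangular is illustrative.

```python
def triangular(n):
    """n-th triangular number 0 + 1 + ... + n, via the closed form n(n+1)/2."""
    return n * (n + 1) // 2

for n in range(1, 9):
    # Two directly successive triangular numbers add up to a square number.
    total = triangular(n - 1) + triangular(n)
    print(f"{triangular(n - 1)} + {triangular(n)} = {total} = {n}^2")
```

The reason this always works is that (n-1)n/2 + n(n+1)/2 = n(2n)/2 = n².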
The developmental recursivity of triangular numbers and square numbers:
As a structural proof of the recursivity of triangular numbers and square numbers, the following connection essentially represents a modified excerpt from Pascal's triangle (see Appendix 3).
The graphical representation is suitable for clarifying the structural connection between triangular numbers and square numbers (see Figure 4; slight distortions in the representation are possible due to screen technology and formatting technology, so that the figures shown are not 100% correctly displayed as square figures).
Developmental connection in the defined matrix between triangular numbers and square numbers
The graphical representation shows how triangular numbers and square numbers can be viewed and defined quantitatively in their mutually dependent evolution: part of the larger triangular number, which together with the preceding triangular number forms a specific square number, can be assigned to the main diagonal of a square matrix. This means that the elements on the main diagonal, designated here as (gamma), can set-theoretically (depending on the point of view) be assigned either to the element set (alpha) or to the element set (beta) in the matrix. This allows statements about the formation of specific sets to be described in a simplified manner. However, for the benefit of the matrix defined here, the elements located on the main diagonal, i.e. the fraction (gamma), should be viewed as an independent set.
With regard to the formation of square numbers from directly successive triangular numbers, it makes sense to introduce a further consideration of the square figure: a square figure with side length x can be laid out with x² elements (e.g. square pieces of cardboard or mussel shells arranged in a square). This definition establishes the connection to the figurate numbers, which were already intensively researched and discussed by the ancient Greeks. By referring to the figurate numbers, it can also be directly transferred to calculating with the matrix defined here:
Set-theoretically, of a number of x² elements (which are each specifically formed from the sets Alpha, Beta, Gamma), a number of to express how square numbers can be formed from triangular numbers. Furthermore, further fundamental statements about square numbers (and thus square figures) are possible, which can be directly transferred to the handling of the matrix defined here:
Statement: the elements contained in the set Alpha+Beta+Gamma of a square matrix can always be divided according to the following key if we define and view the square matrix as a (square) Cartesian product: the elements contained in the Alpha and Beta regions of the matrix mirror each other in a definable way, while each element in the Gamma region is unique in that it has no separate mirrored counterpart, if we consider the contents of the matrix as sets and their elements. In set-theoretic terms this results in:
This set-theoretic connection can be used to create, for example, phonemic or syllabic databases in order to achieve the most efficient data economy.
EXAMPLE:
If a number of 4 different elements are raised to form a (square) Cartesian product, the following possible combinations arise, e.g.:
A(0), A(1) = {1, 2, 3, 4}
A(0) = A(1)
resulting combinations (in a specially defined sorting):
(see appendix 6)
{A(0)XA(1) ∈ ℕ / (a, b, c, d), (a = 1, b = 2, c = 3, d = 4), (a, b, c, d ∈ A(0), a, b, c, d ∈ A(1) / A(0)XA(1) = (Α,Β,Γ)}
A(0)XA(1) ∈ ℕ - beta = {(1,1), (2,1), (3,1), (4,1), (2,2), (3,2), (4,2), (3,3), (4,3), (4,4)}
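The reduced set above can be reproduced with a short Python sketch. The filter i >= j is an assumption read off from the listed pairs: it keeps the main diagonal and one of the two mirrored triangles.

```python
elements = [1, 2, 3, 4]

# Keep only pairs whose first element is not smaller than the second;
# the dropped pairs are the mirror images that "reading backwards" recovers.
reduced = [(i, j) for j in elements for i in elements if i >= j]

print(reduced)
print(len(reduced))  # 10 of the 16 combinations remain
```

The 10 remaining pairs correspond to the triangular number for 4, and the 6 pairs that were dropped to the triangular number for 3.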
When (manually) creating databases for, for example, phonemes and syllables of a specific language, a - specific - considerable amount of data (according to a fixed calculation key) can be saved if the "backward reading" option is used.
In this context, "reading backwards" is actually meant in the literal sense: with the appropriate disposition or experience and training, it is possible to recognize the meaning of text even in texts written backwards. Experienced readers can make use of this effect when creating, for example, phoneme databases based on the principle of using the Cartesian product of basic data sets.
Creating a database (here as an example) for phonemes or syllables, for example, using the principle of the Cartesian product works according to the method described below:
Create a database manually, e.g. for phonemes or syllables, using the principle of the Cartesian product:
The procedure described here consists of the following steps:
Step 1: Define the basic inventory of phonemes or syllables.
Step 2: Collect elements of the basic inventory to form a (square) Cartesian product.
Step 3: Eliminate unnecessary data when using the “read backwards” option or save it from the start.
(Note: Steps 2 and 3 are repeated frequently depending on the desired combination result in the sense of "chain combinations" - more on this later.)
To the individual steps:
Step 1: Define the basic inventory of phonemes or syllables
For the following application example, I define a selected (here fictitious) basic inventory of syllables. The individual syllables as elements of the set A(0), with 4 different elements altogether, are here:
1 = bo
2 = kun
3 = tun
4 = tos
A(0) = {bo, kun, tun, tos}
The individual syllables can optionally be represented as numbers in a square matrix. If A(0) and A(1) are now raised to the Cartesian product at A(0) = A(1), the corresponding (all) possible combinations of the syllables arise in the resulting square matrix (see appendix 7 and 8).
How useful this approach can be for creating databases with complete combination numbers of, for example, phonemes or syllables becomes clear when (also fictional) translations (here in the example into English) are assigned to the fantasy syllables, e.g.:
1 = bo = "boat"
2 = kun = "far"
3 = tun = "wide water"
4 = tos = "move"
The complete and gap-free combinations created in the square matrix as a (square) Cartesian product can be compared with the reduced set of elements: it becomes clear here that the meaning of the text can also be grasped (with appropriate skills or training) if the combinations are read backwards:
Overall combinations determined in the example used here for four different (autonomous) syllables (here also “solitary”):
(listed in the specific sorting selected here)
Basic inventory of defined solitaires (here syllables) and their assignment:
1 = bo = "boat"
2 = kun = "far"
3 = tun = "wide water"
4 = tos = "move"
Resulting syllable constellations in A(0)XA(1) (squared Cartesian Product):
centered:
I: bo,bo
II: kun,kun
III: tun,tun
IV: tos,tos
- - -
{regular} (centered) [mirrored]
I: {kun,bo} [bo,kun]
II: {tun,bo} [bo,tun]
III: {tos,bo} {tun,kun} [kun,tun] [bo,tos]
IV: {tos,kun} [kun,tos]
V: {tos,tun} [tun,tos]
Resulting syllable constellations in A(0)XA(1) (squared Cartesian Product) "translated":
centered:
I: boat,boat
II: far,far
III: wide water,wide water
IV: move,move
- - -
{regular} (centered) [mirrored]
I: {far,boat} [boat,far]
II: {wide water,boat} [boat,wide water]
III: {move,boat} {wide water,far} [far,wide water] [boat,move]
IV: {move,far} [far,move]
V: {move,wide water} [wide water,move]
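The listing above can be reproduced with a small Python sketch. It stores only the diagonal and the "regular" pairs and recovers the mirrored pairs by reversing each stored pair. The variable names stored and restored, and the dictionary meaning, are illustrative and not part of the method itself.

```python
syllables = ["bo", "kun", "tun", "tos"]
meaning = {"bo": "boat", "kun": "far", "tun": "wide water", "tos": "move"}

# Stored entries: diagonal pairs plus one reading direction of every other pair.
stored = [(x, y) for j, y in enumerate(syllables)
          for i, x in enumerate(syllables) if i >= j]

# "Reading backwards" restores the mirrored pairs that were not written down.
restored = set(stored) | {(y, x) for x, y in stored}

for x, y in stored:
    print(f"{x},{y} = {meaning[x]},{meaning[y]}")
```

Only 10 entries have to be written down, yet all 16 combinations of the square matrix remain available to a reader who also reads each entry backwards.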
The meaning of this combinatorial and combination-eliminating approach becomes clear when we consider the data economic effect of this approach in chain combinations.
Chain combinatorial data economics in (square) Cartesian products
The previous explanations have shown that when determining the number of combinations in square matrices using the principle of (square) Cartesian products, elements (and thus data) can be eliminated if the "read backwards" option is used in the result output. This “data economy” or “data efficiency” can be precisely calculated in terms of an “ideally efficient data yield”. The corresponding calculation options for triangular numbers and square numbers are used for this purpose. These relationships can be clearly illustrated in a table overview:
Data economic yield when applying the read-back option in square Cartesian products:
generally for square matrices A(0)XA(1); "elements" = e.g. phonemes or syllables
Table columns:
[C1] = [elements; x]
[C2] = [combinations; x²]
[C3] = [data saving (combinations; proportionate)]
[C4] = [data saving (combinations; fractional)]
[C5] = [data saving (combinations; percent)]
[C1] / [C2] / [C3] / [C4] / [C5] // [line]
[x] / [x²] / [prop.] / [fract.] / [%]
- - - - - - - - - - - - - - - - - - - - - - - - - -
[1] / [1] / [0] / [-] / [0.00%] // [line I]
[2] / [4] / [1] / [1/4] / [25.00%] // [line II]
[3] / [9] / [3] / [3/9] / [33.33p%] // [line III]
[4] / [16] / [6] / [6/16] / [37.50%] // [line IV]
[5] / [25] / [10] / [10/25] / [40.00%] // [line V]
[6] / [36] / [15] / [15/36] / [41.66p%] // [line VI]
[7] / [49] / [21] / [21/49] / [42.85...%] // [line VII]
[8] / [64] / [28] / [28/64] / [43.75%] // [line VIII]
[9] / [81] / [36] / [36/81] / [44.44p%] // [line IX]
[10] / [100] / [45] / [45/100] / [45.00%] // [line X]
[...] / [...] / [...] / [...] / [...] // [...]
This data reduction, whose proportion grows with every additional element and approaches the absolute limit of 1/2 (50%) without ever reaching it, can be used for the chain-like determination of combinations for even larger numbers of elements of a basic inventory (e.g. phonemes or syllables) of a selected language (and for other purposes).
For example, with a total of 20 (solitary) elements of a basic language inventory (see the Sumerian language and its linguistic inventory as far as we know today), a simple (linear) application of the method described above already saves 47.50% of the data: (400 - 20)/2 = 190 of 400 combinations. This factor of possible data savings increases accordingly if the method is adapted into chain operations. Measured against the limit of 50%, the saving possible with 20 placeholders (elements, e.g. phonemes or syllables) in simple-linear combination chains is already 47.50/(50/100) = 95% (more on this later).
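The figures in the table and the 20-element case can be recalculated with the following Python sketch. It assumes that the saving for x elements is (x² - x)/2 combinations out of x²; the function name data_saving is hypothetical.

```python
from fractions import Fraction

def data_saving(x):
    """Return (saved combinations, saved share) for a base of x elements."""
    saved = (x * x - x) // 2          # mirrored pairs that need not be written
    return saved, Fraction(saved, x * x)

for x in (4, 10, 20):
    saved, share = data_saving(x)
    print(f"x={x}: {saved} of {x * x} saved = {float(share):.2%}")

# Share of the theoretical limit of 1/2 reached with 20 elements.
print(float(data_saving(20)[1] / Fraction(1, 2)))  # 0.95
```

Exact fractions are used so that periodic values such as 3/9 are not distorted by rounding before they are printed.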
How a chain-like formation of possible (complete) combinations of a specific basic inventory of elements (e.g. a specific language) can be linked together in a chain-combinatory manner - and thus used more effectively - is explained below:
Make data yield maximally efficient through chain combinations of Cartesian products
If the results of Cartesian products in chain combination series of selected elements are exploited in intermediate steps and collected into new Cartesian products, enormous, maximally efficient data savings can be achieved in this way (e.g. with regard to the determination of possible combinations for phonemes and syllables for the purpose of creating complete phoneme and syllable databases).
For this purpose, the results (as a specific "data yield" or combination yield) of a given (square) Cartesian product - always using the "backward reading option" (and the resulting mathematical procedure described above) - are made into a new base stem of elements for the collection of another (square) Cartesian product, and so on. In chain operations, this process is repeated until the desired data or combination yield (here: "target") is achieved. For the generation of all possible combinations of, for example, 4 solitary basic elements (e.g. phonemes or syllables), this means that the process has to be completed twice in total: a Cartesian product is carried out twice in a row, each time with the respective basic stem of elements, and the basic stem for the second (square) Cartesian product results from the first. For a larger base of elements as a starting point for the first (square) Cartesian product, the process must be carried out correspondingly more often. The effort involved in data collection (e.g. for manually writing down elements and combinations as a starting point for collecting (square) Cartesian products) increases accordingly. However, at the beginning (i.e. when collecting the first (square) Cartesian product) the effort required to generate data is still correspondingly low: thus, at the beginning of such chain operations, the foundations for the greatest possible (most efficient) data savings can be laid.
(more on this later).
Further data savings (e.g. when determining combinations of phonemes or syllables) can essentially be achieved through the chosen notation:
For example, an appropriately trained analyst can assign numbers to the selected basic inventory of phonemes or syllables and adapt this spelling to the evaluation and reading of texts. However, this step must be thought through carefully beforehand, because once trained, an analyst may become strongly conditioned to the reading option acquired in this way, so that it works well for at most one specific language. For example, the (to our current knowledge) 20 solitaires of the Sumerian language can be transcribed into numbers. The reader would simply have to learn to equate the number sequences chosen for transcription with readable text (which in many cases may be possible with appropriate training and practice). The advantage of such an approach, at least for writing (the readability of such "texts" is - as already mentioned - a different issue altogether), is data efficiency (more on this later).
The mathematical procedure used in the chain operations described above to generate possible (complete) combination databases using the principle of the (square) Cartesian product can also be explained very well using numbers for the designation of elements (in theory, a designation with e.g. binary numbers is of course also possible from the outset; however, due to the scope required, this connection will not be explained in more detail here).
Example with a basic stem of 4 solitary elements
In order to generate all possible (complete) combinations with a basic stem of 4 solitary elements using the principle of the (square) Cartesian product, it is necessary to collect a (square) Cartesian product of specific elements twice in a row. In the first (square) Cartesian product, two elements at a time are first connected to each other in a specific multiple manner. The result of such a first Cartesian product is a data set that is specifically (and maximally efficiently) reduced by using the “backward reading option”: a specific number of relevant combinations. This data set is combined into a new basic stem of elements (and sorted specifically for this purpose). The selected sorting is subject to the previously defined preferences (and can therefore be influenced for specific reasons).
After the generated 2nd basic stem of elements has been raised to the independent (square) Cartesian product, the possible (maximum efficient combinatorial total yield is generated, because in this second operation the K
In each operation, 2 previously combined elements (as independent, specific solitaires) are dual-combined with each other.
The sequence of the dual combination of specific solitary elements follows the following scheme:
[Step I:] a to b = a,b
[Step II]: a,b to a,b(1) = [a,b], [a,b(1)]
(see appendix 9, follows asap)
For multiple operations that result from a corresponding number of elements selected as a specific base stem, this operation method (with continuous dual combination of the resulting elements, each of which is combined to form new, independent base stems) is repeated frequently.
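The two-stage chain for 4 solitary elements can be sketched in Python as follows. This is only an illustration under a stated assumption: the text does not spell out exactly how "reading backwards" is applied in the second stage, so the sketch assumes that both the order of the two halves and the order within each half may be reversed by the reader. The function names reduced_square and readings are hypothetical.

```python
from itertools import product

def reduced_square(base):
    """Square Cartesian product of `base` with the mirrored half left out."""
    return [(x, y) for j, y in enumerate(base)
            for i, x in enumerate(base) if i >= j]

def readings(entry):
    """All readings of a stored entry: as written and swapped (read backwards)."""
    x, y = entry
    return {(x, y), (y, x)}

base = ["a", "b", "c", "d"]
stage1 = reduced_square(base)     # 10 stored pairs instead of 16
stage2 = reduced_square(stage1)   # 55 stored pairs of pairs instead of 100

# Expand every stored entry back into plain four-element sequences.
sequences = set()
for left, right in stage2:
    for first, second in readings((left, right)):
        for p, q in product(readings(first), readings(second)):
            sequences.add(p + q)

print(len(stage1), len(stage2), len(sequences))
```

Under this assumption, the 55 stored entries of the second stage are enough to recover all 4^4 = 256 four-element sequences, which is the kind of saving the chain operation is meant to deliver.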
The main effort lies in summarizing and sorting the resulting combinations of elements (here when carrying out these operational steps by hand, which can also entail corresponding health risks; please be sure to note the disclaimer attached to this article!). Even if modern office software with spreadsheet functions is used, the step of manually writing down the output of combinations, and thus using them as new, optimally usable solitary elements for creating basic databases, cannot yet be avoided (to my current, but possibly expandable, knowledge). However, recording such elements manually has the advantage that a large part of the data effort can be reduced in advance, for example if, with appropriate expertise, the syntax of a selected language can be brought to bear on the principles described above: assuming an expert can state (with absolute certainty) that certain combinations of, for example, phonemes or syllables are superfluous in the sense of the (known) syntax of that language or would never be used, then the generated basic stems of combinations can be thinned out in advance. The syntax of the chosen language (if reliably known) is thus one of the main arguments for saving data when applying the described method of creating databases for e.g. phonemes and syllables. (In addition, with appropriate experience, the copy and paste functions of a spreadsheet software can significantly reduce the overall burden of this procedure.)
DISCLAIMER (Haftungsausschluss):
The reason I'm posting this topic in this section is because a connection to archaeomathematics automatically arises. Due to the technical possibilities, it is not easy to clearly present the necessary connections in a simple manner. I'm trying to make the best of it.
Please note that any use of the content I have written on this topic, despite careful checking, is entirely at your own risk and free of any liability on the part of me as the author. Especially for the aspects of language research and mathematics, it should be emphasized that my contributions to this topic are not peer-reviewed and that I pursue these aspects of research purely as a hobby: I am neither a qualified mathematician nor a language researcher and have worked out all the connections myself based on logical conclusions, systematic experimentation and a little reading on the subject. I would particularly like to point out that my own way of presenting connections (e.g. in formulas and logical argument chains) may differ from customary international conventions. One of the most important differences in this context is that here in the forum I write factorials with the exclamation mark in front of the number - and not after it, as is usual and internationally agreed.
Please note in particular that the connections I describe in this topic are not necessarily and automatically correct (nor do they have to be) and, despite the most careful checking, can contain errors (e.g. also formatting errors and auto-correction errors as well as grammatical errors). Overall, the contexts I describe here in the topic are in no way suitable as a replacement for qualified teaching or as a certified or certifiable learning aid. The texts I have written on this topic have not been proofread by others.
However, the basic principle can theoretically be transferred to e.g. phoneme inventories with any number of elements, and its limits lie exclusively in technical or manual feasibility. Because I think it makes sense to understand and convey the basics, I do not deal in this topic as a whole with the possibilities of modern IT (e.g. special software and its possible programming): in this topic I am simply showing how relatively far-reaching results can be generated using the simplest means (manually and supported by office software in the form of spreadsheets).
I must expressly warn anyone who would like to deal with this topic in more detail at this point: creating complex spreadsheet documents - e.g. in handwritten form or in the form of computer documents - can lead to extreme and even health-damaging physical stress. Such stress can include, for example, physical and psychological fatigue, poor ergonomic posture and strain on the tendon sheaths (see e.g. "mouse arm"). Please therefore pay attention to the usual warnings and note that I, as the author of this topic, assume no liability whatsoever for any possible consequences of a recipient's engagement with the issues described in this topic.
Any errors that may still exist in the author's contributions can be corrected by the author at any time and without prior notice and without special highlighting/marking. The author's contributions to this topic are continuously checked for errors and, if possible, corrected promptly by the author.
The author's contributions and discussions in this topic may (unintentionally) contain serious errors in logic, conclusions, calculations, formulations and, for example, grammatical and spelling errors, among other things. In particular, any use of the content and context published by the author in this topic, including with regard to copyright issues and - despite the most careful check - any existing problems in this regard, is entirely at the user's own risk and free from any liability on the part of the author.
In the event of any use of the content and context published by the author in this topic and any resulting consequences - including legal ones - it is under no circumstances the responsibility of the author to give or express any guarantees in advance that the contents the author has published in this topic are, despite careful examination, free of any copyright protection rights (also depending on the jurisdiction of a respective country and with general reference to international, stellar, galactic, interstellar and intergalactic law). This generally also applies to any future law.
In particular, the author assumes no liability for lost sales and profits with the content and context published in this topic.
Chain-operating data reduction in (square) Cartesian products
If combinations are generated in chain operations using (square) Cartesian products in square matrices in a way that optimizes data efficiency, the following picture emerges for the maximum achievable data saving:
The tabular overview below shows that the data saving achieved by applying the "read backwards" principle to a single, non-chained square Cartesian product is always a triangular number: as the base value (here x) increases, the savings run through the series of triangular numbers (0, 1, 3, 6, 10, ...).
[x] / [x²] / [data savings (x² - x)/2] / [data savings in % of x²] / [Line]
[1] / [1] / [0] / [0.000 %] / [Line 1]
[2] / [4] / [1] / [25.000 %] / [Line 2]
[3] / [9] / [3] / [33.333p %] / [Line 3]
[4] / [16] / [6] / [37.500 %] / [Line 4]
[5] / [25] / [10] / [40.000 %] / [Line 5]
[6] / [36] / [15] / [41.666p %] / [Line 6]
[7] / [49] / [21] / [42.857... %] / [Line 7]
[8] / [64] / [28] / [43.750 %] / [Line 8]
[9] / [81] / [36] / [44.444p %] / [Line 9]
[10] / [100] / [45] / [45.000 %] / [Line 10]
[11] / [121] / [55] / [45.455... %] / [Line 11]
[12] / [144] / [66] / [45.833p %] / [Line 12]
[13] / [169] / [78] / [46.153... %] / [Line 13]
[14] / [196] / [91] / [46.428... %] / [Line 14]
[15] / [225] / [105] / [46.666p %] / [Line 15]
[16] / [256] / [120] / [46.875 %] / [Line 16]
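The table above can be recomputed with a few lines of code. The following is only a minimal sketch in Python (my choice of language for illustration; the function name savings_row is hypothetical and not part of the spreadsheet procedure described in this topic). It uses the formula from the table header, (x² - x)/2, and expresses the saving as a percentage of x².

```python
def savings_row(x: int) -> tuple[int, int, int, float]:
    """Return (x, x squared, saved combinations, saving in percent of x squared)."""
    full = x * x                # size of the complete x-by-x Cartesian product
    saved = (x * x - x) // 2    # mirrored pairs that need not be stored
    percent = 100 * saved / full
    return x, full, saved, percent

# Recompute lines 1 to 16 of the table.
rows = [savings_row(x) for x in range(1, 17)]
for x, full, saved, percent in rows:
    print(f"[{x}] / [{full}] / [{saved}] / [{percent:.3f} %] / [Line {x}]")
```

Since (x² - x)/2 divided by x² equals (x - 1)/(2x), the percentage approaches 50 % as x grows but never reaches it.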
How much data can be saved in connected, successive chain operations when applying the principle of the Cartesian product depends on how many placeholders (e.g. 20 phonemes) are used. To explain the principle, an example with 8 placeholders (e.g. 8 phonemes) is used here: assuming, fictitiously, a language that has only 8 phonemes in its basic inventory, 3 consecutive chained combination operations according to the principle of the Cartesian product can generate all possible phoneme combinations of such a fictitious language, with the following restriction: none of the phonemes may appear in the output line more than (...follows).
Nonlinear chained operations using the Cartesian product principle (8 placeholders):
Requirements:
Basic phoneme inventory to combine (8 phonemes total, absolutely fictional):
Phoneme 1 = BA
Phoneme 2 = KO
Phoneme 3 = JO
Phoneme 4 = PAE
Phoneme 5 = TOK
Phoneme 6 = PAR
Phoneme 7 = POI
Phoneme 8 = PON
Processes to be carried out / STEPS:
For reasons of space and scope, only the results can be listed here, in the specific order chosen; the tabular overviews used to generate the Cartesian products are not shown. The data is first output as numerical codes and then transferred to phonetic databases, in which each number is assigned the corresponding phoneme.
In the first combining step, each phoneme is combined with every other phoneme to form the resulting possible dual combinations (pairs):
STEP I:
All placeholders are combined into pairs in every possible combination. Mirrored pairs no longer need to be listed (use of the "read backwards" option). A slash separates the combinations; the marker /-/ (two slashes with a minus sign in between) closes a specific sequence.
results (36 combinations) as numbers:
1,1/2,1/3,1/4,1/5,1/6,1/7,1/8,1/-/
2,2/3,2/4,2/5,2/6,2/7,2/8,2/-/
3,3/4,3/5,3/6,3/7,3/8,3/-/
4,4/5,4/6,4/7,4/8,4/-/
5,5/6,5/7,5/8,5/-/
6,6/7,6/8,6/-/
7,7/8,7/-/
8,8
results (36 combinations) as phonemes; transliterated:
BA,BA/KO,BA/JO,BA/PAE,BA/TOK,BA/PAR,BA/POI,BA/PON,BA/-/
KO,KO/JO,KO/PAE,KO/TOK,KO/PAR,KO/POI,KO/PON,KO/-/
JO,JO/PAE,JO/TOK,JO/PAR,JO/POI,JO/PON,JO/-/
PAE,PAE/TOK,PAE/PAR,PAE/POI,PAE/PON,PAE/-/
TOK,TOK/PAR,TOK/POI,TOK/PON,TOK/-/
PAR,PAR/POI,PAR/PON,PAR/-/
POI,POI/PON,POI/-/
PON,PON
(more content will follow soon)
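STEP I can also be reproduced programmatically. The following is a minimal sketch in Python, not the spreadsheet procedure itself; the names PHONEMES, step_one_rows and format_rows are hypothetical helpers introduced only for this illustration. Each pair is written with the later code first, in the order used in the phoneme listing above, and mirrored pairs are left out ("read backwards" option).

```python
# Fictional phoneme inventory from the example above, keyed by numeric code.
PHONEMES = {1: "BA", 2: "KO", 3: "JO", 4: "PAE", 5: "TOK", 6: "PAR", 7: "POI", 8: "PON"}

def step_one_rows(codes):
    """Row i pairs codes[i] with itself and with every later code (no mirrored pairs)."""
    return [[(high, low) for high in codes[i:]] for i, low in enumerate(codes)]

def format_rows(rows):
    """Join pairs with '/' and close every row except the last with the '/-/' marker."""
    lines = ["/".join(f"{a},{b}" for a, b in row) for row in rows]
    return [line + "/-/" for line in lines[:-1]] + lines[-1:]

codes = sorted(PHONEMES)
number_rows = step_one_rows(codes)
# Transfer the numeric codes to phonemes, number by number.
phoneme_rows = [[(PHONEMES[a], PHONEMES[b]) for a, b in row] for row in number_rows]
total = sum(len(row) for row in number_rows)

for line in format_rows(number_rows) + format_rows(phoneme_rows):
    print(line)
print(total)  # number of pairs, 8 + 7 + ... + 1
```

The numeric listing is generated first and only then mapped to phonemes, which mirrors the two-stage approach described under "Processes to be carried out / STEPS".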
STEP II:
From Step II onwards, the advantages of modern electronic data processing and spreadsheet software begin to take effect: using copy & paste and the functions of the spreadsheet software, the work steps for determining all the resulting combinations, which would be extremely time-consuming to write out by hand, can be carried out in a very short time with an appropriate concept (see appendix 10). In less than 1 hour, perhaps even in about half an hour, it is possible to determine all relevant combinations of 36X36 and lay them out as a spreadsheet.
Determination of the number of combinations by calculating the specific sum number (as a triangular number):
A simple method (though, depending on the approach used, mathematically involved) for determining the number of possible combinations when the "read backwards" option is used in generating a Cartesian product (see data savings) is the following: the consecutive whole numbers from 1 up to the number of elements are simply added up, following the principle of the series of triangular numbers. For the 36X36 combinations this gives SUM(36) = {1+2+3+4+5+ ... +32+33+34+35+36} = 666. With modern spreadsheet software, such totals can be determined in just a few seconds.
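The figure of 666 can be cross-checked by actually enumerating the Step II combinations. The sketch below is my own illustration in Python, not the spreadsheet method; it uses the standard-library function itertools.combinations_with_replacement, which lists every unordered pair exactly once and therefore behaves like the "read backwards" option. The helper name triangular is hypothetical.

```python
from itertools import combinations_with_replacement

def triangular(n: int) -> int:
    """Sum 1 + 2 + ... + n, i.e. the n-th triangular number."""
    return n * (n + 1) // 2

# Step I: unordered pairs of the 8 placeholder codes.
step_one = list(combinations_with_replacement(range(1, 9), 2))
# Step II: unordered pairs of the Step I results.
step_two = list(combinations_with_replacement(step_one, 2))

print(len(step_one), len(step_two), triangular(36))  # prints: 36 666 666
```

Counting by enumeration and counting by the closed formula n(n + 1)/2 agree, so the formula alone is sufficient when only the size of a step is needed.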
STEP III:
Step III is not presented here due to the scope and effort required. Raising the 666 elements to the 666X666 Cartesian product would yield a total of 443556 combinations, which the "read backwards" option would reduce to 222111 combinations.
Creating such a database manually with conventional spreadsheet software can be accomplished in a relatively short period of time, perhaps even in a single day.
However, I would like to avoid this work at this point, as the main focus of this topic article is on the Sumerian language with a phoneme inventory of 20 phonemes, which requires a different approach when creating corresponding databases.
Summary:
This work may seem very laborious. However, if the individual work steps are optimized and, if necessary, transferred to many different people, it is realistically possible to create basic databases with, for example, numbers as placeholders that are suitable for transfer to databases with, for example, phonemes or syllables. Once this basic work is done, the basic databases do not need to be created again for any further action. Added to this are the modern possibilities of programmed, automated creation of such databases using software that is to be developed (or possibly already developed) specifically for this purpose.
Another advantage of the method described above is that it makes it easier to study the algorithms by which the combination sequences in chained Cartesian products arise, in order to find simplifications for the development of combination series.
I will also spare myself the task of spelling out the combination rows created in Step II, whose placeholders are filled with numbers, using the fantasy phonemes from the example above for illustration. I think the basic principle of the possible procedure has now been sufficiently discussed and explained.
Possible procedure for creating a phoneme database with 20 placeholders as a basic inventory:
The overall procedure for creating a phoneme database with 20 placeholders as a basic inventory requires some preliminary considerations regarding the total effort involved: here the application of specific mathematics helps us in advance to avoid the worst errors and thus saves unnecessary work and time.
If we plan to create a complete, gapless phoneme database with 20 placeholders as a basic inventory, we must be aware in advance that the desired database is extremely large and requires a corresponding amount of effort to create: every logical error at the beginning will eventually take its toll over the course of the project and can destroy hours, entire days, weeks and months of work. The basic mistakes that tend to creep in at the beginning of such an undertaking are best avoided by thinking through such a project in detail beforehand.
The main mathematical aspects that should definitely be considered before starting such a mammoth project are the following:
Determining the number of placeholders
The amount of data required later in the project depends directly on the number of placeholders fixed in advance. Because every combination step squares the number of elements (roughly halved by the "read backwards" option), the data effort grows far faster than the number of placeholders itself.
When determining in advance the number of placeholders and deciding what meaning should be assigned to them (e.g. phonemes, syllables), it is important to consider whether we can really be sure how many phonemes or syllables a specific researched language actually has. It is also essential to be aware of the different effects that result from the method chosen to create such a database. In the context of my previous statements on this topic, two basic methods must be distinguished: as already mentioned, the method using the Cartesian product principle combined with the "read backwards" option, and the method I have called the "telephone book method". I will discuss the "telephone book method" separately in a follow-up post on this topic.
You must be aware of the following project scope when 20 placeholders are used to create databases by chaining Cartesian products, with the intermediate result of each step carried over to the next and the "read backwards" option applied. A further crucial question is how many placeholders of readable text, lined up one after the other, we ultimately want to generate in a specific language (here, Sumerian). Clarifying these questions in advance has a direct influence on which steps we choose in the planned project:
With a basic inventory of 20 placeholders, combining 2 placeholders yields 20X20 = 400 basic combinations (without gaps); with the "read backwards" option, SUM(20) = 210 combinations are generated. In the 1st process step the "read backwards" option therefore saves 190 combinations. The following tabular overview shows how this data effort develops if we actually aim to combine 20 placeholders with each other without any gaps, so as ultimately to form 20 elements (placeholders) of readable text arranged in a row. The resulting superset (master set) of readable text that is output depends mathematically on the respective process step and the selected generation method:
Chain algorithmic generation of combinations with 20 placeholders basic inventory:
(Selected method: Data-efficient concatenation of Cartesian products with the "read backwards" option.)
x = number of placeholders (basic inventory)
SUM(x) = number of combinations kept with the "read backwards" option; SUM(x) = [(x² + x) : 2]
data saving per step = [(x² - x) : 2]
AXB = generated Cartesian product
carry(x) = Carry for x in the next step; carry(x) = SUM(x)
[x] / [AXB] / [x²] / [SUM(x)] / [carry(x)] / [line]
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
[20] / [20X20] / [400] / [SUM(20)] / [carry(210)] / [line1]
[210] / [210X210] / [44100] / [SUM(210)] / [carry(22155)] / [line2]
[22155] / [22155X22155] / [490844025] / [SUM(22155)] / [carry(245433090)] / [line3]
...
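The development of this chain can be sketched in a few lines of Python. This is only an illustration under the assumption that the carry into each next step is the triangular number SUM(x) = x(x + 1)/2, the same rule that gives SUM(20) = 210 and SUM(210) = 22155; the function names triangular and chain are hypothetical. For each step the sketch reports the full Cartesian product, the combinations kept, and the combinations saved.

```python
def triangular(n: int) -> int:
    """SUM(n) = 1 + 2 + ... + n, the number of combinations kept with 'read backwards'."""
    return n * (n + 1) // 2

def chain(x: int, steps: int) -> list[tuple[int, int, int, int]]:
    """Return one (x, x squared, kept, saved) tuple per chained step."""
    rows = []
    for _ in range(steps):
        full = x * x            # complete x-by-x Cartesian product
        kept = triangular(x)    # carry into the next step
        rows.append((x, full, kept, full - kept))
        x = kept
    return rows

for line, (x, full, kept, saved) in enumerate(chain(20, 3), start=1):
    print(f"[{x}] / [{x}X{x}] / [{full}] / [kept {kept}] / [saved {saved}] / [line{line}]")
```

Python integers have arbitrary precision, so further steps can be counted without overflow; counting the combinations is cheap, whereas actually storing them quickly becomes impractical.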
The small list quickly makes it clear what an enormous amount of data we would be confronted with in such a project. The main focus of such a project should therefore be on every possible measure that can reduce further data expenditure. One of the most essential ways to achieve this is fundamental knowledge of the syntax of the specific language. Researching the syntax of the language under study in detail is indispensable, among other things, for the possible data savings in decryption attempts mentioned here. More on the strategies by which knowledge of the syntax of a specific language can help us draw conclusions about possible decryption attempts or complete mapping attempts, using the methods already presented here and those still to be discussed, will follow.
DISCLAIMER:
Any use of the content posted by the author in this topic is entirely at your own risk and free of any liability on the part of the author.
SOURCES / BIBLIOGRAPHY:
[books]:
du Sautoy, M.: Die Musik der Primzahlen - Auf den Spuren des größten Rätsels der Mathematik. 7. Aufl. ungek. Ausg., Verlag dtv Wissen, München, 2013
Reiss / Schmieder: Basiswissen Zahlentheorie - eine Einführung (Reihe: Mathematik für das Lehramt) - Eine Einführung in Zahlen und Zahlbereiche. 2. Aufl. Verlag Springer, Berlin, Heidelberg, New York, 2007.
Schmidt / Trenkler: Moderne Matrix-Algebra. Mit Anwendungen in der Statistik. Verlag Springer, Berlin, Heidelberg, New York, 1998.
[gWiki; German-language Wikipedia]:
[gWiki1]:
Bibliographic details for „Sumerische Sprache“
Page title: Sumerische Sprache
Publisher: Wikipedia – Die freie Enzyklopädie.
Author(s): Wikipedia authors, see version history
Date of last revision: 9 September 2023, 21:00 UTC
Page version ID: 237177728
Permanent link: https://de.wikipedia.org/w/index.php?ti ... =237177728
Date retrieved: 1 January 2024, 11:03 UTC
[gWiki2]:
Bibliographic details for „Georg Cantor“
Page title: Georg Cantor
Publisher: Wikipedia – Die freie Enzyklopädie.
Author(s): Wikipedia authors, see version history
Date of last revision: 14 December 2023, 12:47 UTC
Page version ID: 240179661
Permanent link: https://de.wikipedia.org/w/index.php?ti ... =240179661
Date retrieved: 7 January 2024, 11:43 UTC
[gWiki3]:
Bibliographic details for „Carl Friedrich Gauß“
Page title: Carl Friedrich Gauß
Publisher: Wikipedia – Die freie Enzyklopädie.
Author(s): Wikipedia authors, see version history
Date of last revision: 23 December 2023, 07:46 UTC
Page version ID: 240453755
Permanent link: https://de.wikipedia.org/w/index.php?ti ... =240453755
Date retrieved: 7 January 2024, 15:10 UTC