This presentation explores the figures, trends, and analysis of languages in the Internet, focusing on indicators such as power, capacity, gradient, and productivity. The methodology used to derive these indicators is discussed, along with the results and trends for different languages.
Daniel Pimienta, pimienta@funredes.org. Observatory of languages & cultures in the Internet, http://funredes.org/lc. Executive Committee Member of http://maaya.org
Preservation of Languages and Development of Linguistic Diversity in Cyberspace: Context. Policies. Practices. 1-5 July 2019, Yakutsk, Russian Federation
Languages in the Internet: figures and trends. Daniel Pimienta, pimienta@funredes.org, MAAYA
WARNING Unfortunately, the method cannot offer results for languages with fewer than a few million speakers. At this stage we focus on languages with more than 5 million speakers.
BACKGROUND - ANTECEDENTS At the Global Expert Meeting "Multilingualism in Cyberspace for Inclusive Sustainable Development", June 2017, Khanty-Mansiysk, Russian Federation, I presented "Linguistic Indicators in cyberspace: the biases is all" and announced "An alternative approach to produce indicators of languages in the Internet", June 2017. It was first presented as "Towards a multilingual cyberspace. an online Journal to promote language in cyberspace" at the World Humanities Conference, Liège, August 2017, and at the Séminaire méthodologique sur le classement des langues, OIF, Paris, 27-28 June 2018.
THE CONCEPT • Use as many and as varied data from the Internet as possible to approximate the space of languages. • Try to cover a wide range of Internet applications and spaces.
THE CONCEPT • Complement scarce linguistic figures with easier-to-find country figures. • Transform country figures into language figures.
THE CONCEPT • Use simple statistical tricks to derive meaningful indicators. • Pay careful attention to all possible biases.
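The step "transform country figures into language figures" can be illustrated with a minimal sketch. It assumes the simplest possible weighting, in which the speakers of a language in each country are connected at that country's average rate; the function name and the sample numbers are hypothetical, and the method described in the paper may weight the data differently.

```python
def connected_speakers(speakers_by_country, connected_share_by_country):
    """Estimate the number of connected speakers of one language.

    speakers_by_country: {country: number of L1 speakers in that country}
    connected_share_by_country: {country: fraction of persons connected, 0..1}

    Assumption (not taken from the slides): within a country, speakers of
    the language are connected at the country's average rate.
    """
    return sum(
        speakers * connected_share_by_country[country]
        for country, speakers in speakers_by_country.items()
    )


# Made-up illustrative numbers, not real data.
speakers = {"A": 1_000_000, "B": 500_000}
connected = {"A": 0.5, "B": 0.2}

estimate = connected_speakers(speakers, connected)
print(estimate)
```

The same weighting could be applied to any other per-country figure, which is why country data can stand in for scarce per-language data.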
AND PRODUCE • For the 140 languages with more than 5 million speakers. • The following indicators per language:
6 indicators, 369 micro-indicators, 4 MACRO-INDICATORS
Micro-indicators per indicator:
• Internauts: 1
• Contents: 13
• Usages: 11
• Traffic: 316
• Interfaces: 23
• Indexes: 5
Macro-indicators: CAPACITY, POWER, GRADIENT, PRODUCTIVITY
A paper and a presentation describing the methodology are available at http://funredes.org/lc2017/ TODAY WE WILL NOT FOCUS ON THE METHOD BUT ON RESULTS. HOWEVER, QUESTIONS ABOUT THE METHOD ARE WELCOME
RESULTS STAND ON TRIPOD
ITU DATA
• % of persons connected per country
• Updated each year
• Quite reliable
• Most sensitive data
• Free
• Relatively fast changes
LARGE SET OF MICRO-INDICATORS
• Internet data per language or country
• Heavy work
• Permanent watch
• Time consuming
DEMOLINGUISTIC DATA
• L1 language speakers per country
• Ethnologue or Yoshua
• Expensive
• Generally slow changes
STATS GRINDER → RESULTS
MEASUREMENTS Lack of resources means that only one of the three legs can be updated; it is, however, the most significant one.
TOP LANGUAGES IN THE INTERNET: BIAS CORRECTED FIGURE 2017 (CONTENT) [Figure: languages grouped as TOP 2, CHALLENGERS, SOLID, IMPORTANT, NOTABLE]
LET’S NOW SEE THE RESULTS AND TRENDS FOR EACH MACRO-INDICATOR
MACRO-INDICATORS RELATION POWER (of language in the Net) CAPACITY (of speakers) GRADIENT (of connected speakers)
MACRO-INDICATORS RELATION • Power of a given language (%) = capacity of this language in the Internet × its speakers' share of the world population (%) • Power of a given language (%) = gradient of this language in the Internet × its speakers' share of the world connected population (%)
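The two relations can be written as a short sketch. Function and parameter names are hypothetical, the numbers are made up, and the last function is only the algebraic consequence of equating the two expressions for power; it is not presented in the slides as a separate formula.

```python
import math


def power_from_capacity(capacity, speaker_share_pct):
    """Power (%) = capacity x speakers' share of world population (%)."""
    return capacity * speaker_share_pct


def power_from_gradient(gradient, connected_share_pct):
    """Power (%) = gradient x speakers' share of world connected population (%)."""
    return gradient * connected_share_pct


def gradient_from_capacity(capacity, speaker_share_pct, connected_share_pct):
    """Derived by equating the two relations above:
    capacity x speaker share = gradient x connected share.
    """
    return capacity * speaker_share_pct / connected_share_pct


# Made-up illustrative values, not real data.
capacity = 2.0          # capacity of the language in the Internet
speaker_share = 4.0     # % of world population speaking the language
connected_share = 5.0   # % of world connected population speaking it

power = power_from_capacity(capacity, speaker_share)
gradient = gradient_from_capacity(capacity, speaker_share, connected_share)

# Both routes should give the same power, up to floating-point rounding.
print(power, gradient, math.isclose(power, power_from_gradient(gradient, connected_share)))
```

With these definitions, a language whose speakers are better connected than the world average has a gradient lower than its capacity, and the reverse holds for a poorly connected speaker population.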
WHAT IS MEASURED BY MACRO-INDICATORS? CAPACITY measures the vitality in the Internet of the language's speakers, independently of their numbers. POWER measures the percentage of global presence of the language in the Internet, directly linked to its number of speakers. GRADIENT measures the vitality in the Internet of the connected speakers of each language, independently of their numbers. PRODUCTIVITY is related to content production and is also independent of the number of speakers.
THE TOP 15 IN CAPACITY Minor changes, almost always negative, due to the lack of update of the other 2 legs of the tripod.
THE TOP 15 IN GRADIENT Minor changes, almost always positive, due to the lack of update of the other 2 legs of the tripod.
THE TOP 15 IN POWER GROWTH
• Values are % x 1000 (0,8 means 0,0008%)
• This obviously reflects strong growth in the ITU figures for connected people
• Capacity shows the same growth because there was no demographic update
• Gradient shows little change
AS = Asian language, AFF = Francophone African language, AFA = Anglophone African language, AF = African language
THE BOTTOM 15 IN POWER GROWTH
• This obviously reflects the ITU figures for connected people
• Capacity shows the same growth because there was no demographic update
• Gradient shows little change
AS = Asian language, AFF = Francophone African language, AFA = Anglophone African language, AF = African language
Languages spoken in francophone Africa appear both in the top and in the bottom. What happens here is linked to the Internet performance of the countries where those languages are spoken. Let’s have a closer look…
ITU DATA FOR FRANCOPHONE AFRICA In red: low connectivity & low growth. In green: low connectivity but strong growth. In blue: country in good shape. In black: average situation.
A ZOOM ON FRENCH French holds a solid 4th place in the Internet, partly due to its proactive language policies and partly because it has one of the highest L2 populations in proportion to native speakers. The long-term future of French on the Internet is promising thanks to its growing speaker population in Africa, which may be capitalized on once the digital divide is overcome in Africa. Meanwhile, a slight decrease may keep occurring.
CONCLUSION The field of languages in the Internet is still open for positive changes. This type of data is required to conduct appropriate policies, and this field should not be left to marketing forces prone to misinformation. It is almost as difficult to find a serious method for evaluating the space of languages in the Internet as it is to find the budget to conduct the studies.