Scientific metrics on bibliometric studies: detection of outliers for univariate data

Authors

  • Luís Fernando Maia Lima Fundação Universidade Federal de Rondônia, Departamento de Economia.
  • Alexandre Masson Maroldi Fundação Universidade Federal de Rondônia, Departamento de Biblioteconomia.
  • Dávilla Vieira Odízio da Silva Instituto Federal do Amazonas Campus Lábrea
  • Carlos Roberto Massao Hayashi Universidade Federal de São Carlos Departamento de Ciência da Informação
  • Maria Cristina Piumbato Innocentini Hayashi Universidade Federal de São Carlos Departamento de Ciência da Informação

DOI:

https://doi.org/10.19132/1808-5245230.254-273

Keywords:

Outliers. Exploratory Data Analysis. Asymmetry. Bibliometry. Univariate.

Abstract

This study presents a formulation for detecting outliers in univariate data that takes into account both positive and negative asymmetry of the data. The new formulation is based on Exploratory Data Analysis and is evaluated by simulation against the Exploratory Data Analysis procedure found in statistics textbooks and statistical software, which applies only to normal (Gaussian) distributions, i.e., symmetric or slightly asymmetric data. Real data published in two scientific papers on metrics are used in the simulation. For moderate or strong positive (negative) asymmetry, the new formulation detects fewer (more) upper outliers. Outliers in bibliometric data should be taken into account, and their influence on statistics such as the mean and standard deviation should be quantified.
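As a rough illustration of the textbook baseline the abstract refers to, the sketch below applies the standard Tukey boxplot rule (fences at 1.5 times the interquartile range beyond the quartiles) and then compares the mean and standard deviation with and without the flagged values. It is not the asymmetry-aware formulation proposed in the paper, which the abstract does not spell out. The function names, the quartile method, and the citation counts are assumptions made up for this example.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences of the standard Tukey boxplot rule.

    Quartiles use the 'inclusive' method of statistics.quantiles; other
    quartile definitions give slightly different fences.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(values, k=1.5):
    """Split values into (inliers, outliers) using the fences above."""
    low, high = tukey_fences(values, k)
    inliers = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return inliers, outliers


# Hypothetical citation counts with one heavily cited paper (not real data).
citations = [0, 1, 1, 2, 2, 3, 3, 4, 5, 40]

inliers, outliers = split_outliers(citations)
print("fences:", tukey_fences(citations))
print("outliers:", outliers)

# Influence of the flagged values on the mean and standard deviation.
print("mean with / without:",
      statistics.mean(citations), statistics.mean(inliers))
print("stdev with / without:",
      statistics.stdev(citations), statistics.stdev(inliers))
```

Because this rule places its fences symmetrically around the quartiles, it treats both tails alike, which is the limitation for skewed bibliometric data that motivates a formulation sensitive to asymmetry.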

Author Biographies

Luís Fernando Maia Lima, Fundação Universidade Federal de Rondônia, Departamento de Economia.

PhD in Civil Engineering (USP).

Alexandre Masson Maroldi, Fundação Universidade Federal de Rondônia, Departamento de Biblioteconomia.

PhD candidate in Education (UFSCAR).

Dávilla Vieira Odízio da Silva, Instituto Federal do Amazonas Campus Lábrea

Bachelor's degree in Library Science.

Carlos Roberto Massao Hayashi, Universidade Federal de São Carlos Departamento de Ciência da Informação

PhD in Education (UFSCAR).

Maria Cristina Piumbato Innocentini Hayashi, Universidade Federal de São Carlos Departamento de Ciência da Informação

PhD in Education (UFSCAR) and CNPq Research Productivity Fellow.

Published

2017-01-27

How to Cite

MAIA LIMA, Luís Fernando; MAROLDI, Alexandre Masson; SILVA, Dávilla Vieira Odízio da; HAYASHI, Carlos Roberto Massao; HAYASHI, Maria Cristina Piumbato Innocentini. Scientific metrics on bibliometric studies: detection of outliers for univariate data. Em Questão, Porto Alegre, v. 23, p. 254–273, 2017. DOI: 10.19132/1808-5245230.254-273. Available at: https://seer.ufrgs.br/index.php/EmQuestao/article/view/68030. Accessed: 11 Aug. 2025.
