A hierarchy of limitations in machine learning: Data biases and the social sciences

Webinar Series: we are back 😉

29 September – 16:00–18:00 (GMT+2)

Registrations are now open.

We are definitely back after a long and strange summer that has seen us trying to clear our minds in the context of a second wave of the pandemic in Europe.

This webinar series was born in the pre-COVID-19 era, and yet today its central theme, the practices and discourses around digital data in higher education, is gaining momentum. As Ben Williamson opportunely showed, the massive use of data in higher education has been linked to successful discourses on the power of technologies to respond to the many dilemmas posed by emergency remote education. However, as he himself argued, the goals of tech solutionism can turn obscure. In fact, going beyond a naïve hope of repairing the immediate suffering caused by interrupted school and university life, the unprecedented scale of digital platform usage places us in the Big Data economy and its digital behaviorism. Here, a mass of clicks forced by the pandemic could be quickly converted into “data points” for the monetization of educational services based on algorithmic models: machine learning.

Before the pandemic, it already appeared appropriate, and even necessary, to start posing questions about the inherent processes of this nascent computational technique, which makes a particular use of statistics and programming. Winds of concern, from the work of Safiya Noble, Virginia Eubanks and Shoshana Zuboff, led social scientists to interrogate how far these statistical and computational procedures can actually go in capturing the complex reality of social phenomena. Indeed, it turned out that in many cases machine learning has replaced the interpretive and participatory methodologies typical of the social sciences as a basis for recommending decisions in social-health, educational, and judicial contexts.

In such a context, throughout this webinar space we reinforced the idea that an interdisciplinary dialogue is needed to dig into the relationship between technological development and society, between “Psyche and Techne” (borrowing the name Umberto Galimberti gave to that relationship), searching for a more balanced and constructive tension. In the current state of things, the wonder of “possibility” attached to the continuing development of artificial intelligence has obscured reflection about the “opportunity” and especially the social “need” for such technological developments.

In this direction, we spoke with several experts in critical studies of technology, from media education to educational technology to law. But the conversation could not be approached with the necessary sharpness without inviting those who work, with sensitivity and openness, in the computational field.

Momin M. Malik, who will join our roundtable tomorrow, brings us much more than that. His webinar “A Hierarchy of Limitations in Machine Learning: Data Biases and Social Sciences” presents the journey of a researcher who has worked hard to bridge the two distant continents of social science and computer science.

Momin Malik’s article “A Hierarchy of Limitations in Machine Learning,” which originated this webinar proposal, clearly represents his complex thinking. I read his article as someone who prepares herself for a journey of progressive magnification of observation lenses, as if we could observe an object on Earth from the celestial macro-dimension down to the exploration of its molecular structure. Indeed, although it is not an explicit objective in Malik’s work, I did have the feeling of traveling from the “macro” phenomenological and methodological levels, that is, the perceptible and interpretive problems of the social sciences, to the “micro” levels of the formulation of algorithms and their statistical and mathematical structures. In my view, each of these levels includes both an object and a series of methods for its analysis, and excludes others. However, each level can be supported by the conclusions generated by the others. Figure 1, extracted from Malik’s article, is eloquent in this regard.

Figure 1 – A diagram of the hierarchy explained in Malik’s article.

The four elements of the hierarchy are: (1) the methodological framework, qualitative or quantitative inquiry; (2) the method, mathematical or based on probability; (3) the instruments, namely the type of probabilistic model adopted, explanatory or predictive; and (4) the instrument-adjustment procedures, namely the validation approach (validation on data outside the sample selected to generate the model, or cross-validation using values from the same sample).

Malik’s work builds precisely on the criticisms that anthropology, sociology, science and technology studies, and statistics have levelled at machine learning procedures and at the enthusiasm for their applicative potential. However, I found it remarkable that a single researcher, for the first time, raises all these criticisms in a complete and exhaustive sequence (from 1 to 4).

Malik warns us at the beginning of the reading journey: his work foresees two audiences.

Those who come from the computational and statistical/mathematical sciences will find a comprehensive review of the deepest assumptions of modeling and the intrinsic limitations these assumptions induce, as well as some examples in which the assumptions break down, with concrete consequences.

The second type of audience is, of course, the social sciences. Among them are consumers of services based on machine learning: think, in the educational case, of teachers who use learning analytics dashboards, or of those who deploy institutional policies based on these data. Malik offers his analysis so that those who operate at these levels can better understand the contingencies and implications of machine learning, with the ultimate goal of helping them decide if, where and how to adopt it. One of Malik’s concerns is to support social scientists and model consumers in raising objections to machine learning in a specific way, better identifying the points on which they might disagree. Considering the hierarchy, this has to do with the interpretive and conceptual assumptions about a social phenomenon (for example, performance in learning processes, or learning itself); with what are called “data points”, with all the ethical issues associated with their definition as well as with the methods of collecting, cleaning and organizing data (for example, when it is sensitive student data); and with the model validation modalities used and the behaviors they might trigger (e.g., using predictive models to “diagnose” future learning skills).

While the criticisms his work introduces at levels 1 and 2 may not be new to social scientists working with interpretive methods (level 1) or statistical methods applied to the social sciences (level 2), levels 3 and 4 open up a world for them (that’s me!). Likewise, levels 3 and 4 are of crucial importance for inviting reflection on problems more or less known to those who work in data and computer science. And a vision of levels 1 and 2 would stimulate a reflection that, in my opinion, would generate more awareness of how much the modeled phenomena are oversimplified nowadays, inviting users, as well as social scientists and humanists, to get involved in working on models.

Malik also offers some original ideas, such as the “optimism” approach, a simple but novel extension of statistical theory to deal with the problem of cross-validation bias. He argues that this could be an elegant unifying framework for multiple disconnected efforts to think about the proper structuring of progressive data divisions. And here he clearly speaks to data scientists.
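To make the underlying idea concrete: the classical statistical notion of “optimism” (due to Bradley Efron) estimates how much a model’s in-sample error understates its true error, by refitting the model on bootstrap resamples. The sketch below is a minimal, hypothetical illustration of that classical procedure, not Malik’s own extension; the simple linear model and all parameters are invented for the example.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def mse(a, b, xs, ys):
    """Mean squared error of the line (a, b) on the given data."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def optimism_corrected_mse(xs, ys, n_boot=200, seed=0):
    """Efron-style optimism bootstrap: correct the (too rosy) training error."""
    rng = random.Random(seed)
    a, b = fit_line(xs, ys)
    apparent = mse(a, b, xs, ys)  # in-sample error: optimistic
    n, optimism = len(xs), 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        bx, by = [xs[i] for i in idx], [ys[i] for i in idx]
        ab, bb = fit_line(bx, by)
        # How much better the refit looks on its own resample than on the
        # original data estimates the optimism of the apparent error.
        optimism += mse(ab, bb, xs, ys) - mse(ab, bb, bx, by)
    return apparent, apparent + optimism / n_boot
```

On simulated data the corrected error comes out above the apparent one, quantifying exactly the bias that naive in-sample evaluation hides.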

The architecture of an interdisciplinary bridge

The conversation we had with Malik to prepare this webinar was rich and enlightening from various points of view. The complexity of Malik’s ideas led me to explore topics ranging from the paradigm wars of the 1980s–1990s, when qualitative research sought a space for recognition, to a critique of mixed methods in the validation of machine learning models. One topic was the problem of defining constructs, instantiated through the problem of race-blind models as fair machine learning, and particularly in the case of learning analytics systems: building on Simon Buckingham Shum’s post, more on the side of computational science, followed by the provocation opened by Carlo Perrotta, more on the side of the social sciences. Malik expanded this discussion to a question of methods in the way the variable “race” could be constructed, pointing to Lily Hu’s interesting post on adopting a complex, counterfactual approach.

The dimension acquired by the phenomenon of datafication (and more than anything the commodification of data) was unthinkable until February 2020. Until then, the advance of the automation of educational services and processes had been consolidating, in any case, in an aggressive way. During the pandemic, however, this phenomenon found its best chance to test itself. And the results have confirmed the gloomier premonitions about machine learning biases: for example, the terrible mistake of using such a system to assign end-of-course grades in the UK, as Neil Selwyn well points out. A bias that has obviously cost more than one protest march, and surely put a Minister of Education’s post at risk…

Tomorrow Malik will answer our questions, which surely, as in a garden of forking paths, will lead us to new curiosities:

  • In a recent systematic review, you explore the issues connected to machine learning through the lens of a four-decision scheme. It appears to me that you bridged the epistemologies of the social and computer sciences to arrive at such an interesting scheme of reflection. Could you explain what your background is and how you came up with the four-decision scheme?
  • The quantitative vs. qualitative conflict played out in a “paradigm war” during the ’80s. In particular, qualitative researchers refined their methodological instruments to provide a solid basis for their approach. How has this changed in the AI era? Why are we sticking to quantitative logics more than ever nowadays?
  • You say that it’s necessary to take up responsibility for quantification. What do you mean? What does this entail for social research at the crossover with computer science?
  • Could you provide examples of bad measurement in machine learning? What do you mean by “performativity”, or applying the logic of the models to the world?
  • You say that “prediction is not explanation” and point to the problems of error in machine learning. How do you imagine the field could advance to work out the problems entailed by prediction?
  • Correlations are already a matter of suspicion. They were already in the ’50s, when Darrell Huff wrote his book “How to Lie with Statistics” and tried an approach to support basic statistical literacy. Why do we still stick to correlations?
  • Assessing model performance is a major topic in AI, yet most people are unaware of such procedures: the assumption seems to be that if it’s done by a machine, then it’s rigorous. What are the consequences?
  • Let’s now imagine the future of machine learning in social research, as well as the contribution of social research to machine learning: what are your concerns, and what are your sources of hope? Could you imagine, for a moment, the consequences for undergraduate social sciences teaching, as well as possible collaborations and cross-fertilizations with computer science training?
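On the correlation question above, a toy simulation (entirely invented for illustration) shows why correlation invites suspicion: two completely independent random walks, because each carries its own trend, routinely show correlations far larger than the same measure applied to the independent noise that generates them.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

def walk(rng, n):
    """Cumulative sum of independent gaussian steps: a pure random walk."""
    total, path = 0.0, []
    for _ in range(n):
        total += rng.gauss(0, 1)
        path.append(total)
    return path

rng = random.Random(42)
n, trials = 300, 100
mean_noise = mean_walk = 0.0
for _ in range(trials):
    # correlation between two independent white-noise series
    mean_noise += abs(pearson([rng.gauss(0, 1) for _ in range(n)],
                              [rng.gauss(0, 1) for _ in range(n)]))
    # correlation between two independent random walks
    mean_walk += abs(pearson(walk(rng, n), walk(rng, n)))
mean_noise /= trials
mean_walk /= trials
```

Averaged over many trials, the walks’ spurious correlation dwarfs that of the raw noise, even though nothing whatsoever links the two series.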

Without a doubt, we will have to work hard to cross the interdisciplinary bridge that Malik strategically architected. And so he invites us to start the journey:

In terms of alternatives, my work suggests the following tasks:
• Systematically incorporate qualitative methods;
• Seek ways of including variability, or higher moments, as one way of mitigating a reliance only on the central tendency corresponding to the first moment;
• Split data for cross-validation in ways that take dependencies into account;
• Do out-of-sample, real-world testing before making strong claims about model performance (and, conversely, demand such testing before accepting strong claims);
• Be aware, open, and humble about the limitations of machine learning modeling. That is, machine learning modelers should also consider how some tasks might be better addressed

Momin M. Malik, A Hierarchy of Limitations in Machine Learning, 2020, p. 46
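Malik’s point about splitting data in ways that take dependencies into account can be sketched with a toy, entirely hypothetical example: repeated measurements per “student”. A record-wise random split lets records from the same student leak into both sides, so a simple nearest-neighbour predictor looks far better than it really is; holding out whole students gives the honest estimate. The data, the predictor and all parameters here are invented for illustration; this is not Malik’s own code.

```python
import random

rng = random.Random(7)
n_groups, per_group = 20, 10
data = []  # (group, x, y): repeated measurements for each "student"
for g in range(n_groups):
    u = rng.gauss(0, 5)  # latent level of this student (well separated)
    for _ in range(per_group):
        data.append((g, u + rng.gauss(0, 0.1), u + rng.gauss(0, 0.1)))

def one_nn_mse(train, test):
    """Predict each test y from the training record with the nearest x."""
    err = 0.0
    for _, x, y in test:
        _, _, yhat = min(train, key=lambda r: abs(r[1] - x))
        err += (y - yhat) ** 2
    return err / len(test)

# Record-wise split: the same student's records leak across the split,
# so the nearest neighbour is usually a sibling record of the test point.
shuffled = data[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
leaky_mse = one_nn_mse(shuffled[:half], shuffled[half:])

# Group-wise split: whole students are held out together,
# so the model must generalize to students it has never seen.
train = [r for r in data if r[0] < n_groups // 2]
test = [r for r in data if r[0] >= n_groups // 2]
honest_mse = one_nn_mse(train, test)
```

The leaky, record-wise estimate comes out far lower than the honest, group-wise one: exactly the over-optimism that dependency-aware splitting is meant to expose.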

Returning to Galimberti: technique (and in this sense the technique of machine learning) could end up depriving humans of any anticipatory possibility of their own becoming, by replacing the responsibility and authority that derive from explicitly and consciously proposing a future. The feeling of human inadequacy for this important task (deciding one’s own future) finds in technology a kind of subterfuge: the “automatic” and “objective” way, which resolves the daily decision about human fate.

Malik’s work surely invites us to return to human centrality, leaving aside the blindness caused by the “lights” of the technological environment, while digging stubbornly into the ways technology could still serve its purpose as a means to an end.

About our Guest Speaker

Momin M. Malik is a multidisciplinary researcher who brings statistical modeling to bear on critical, reflexive questions with and about large-scale digital trace data. He is broadly concerned with issues of algorithmic power and control, and of validity and rigor in computational social science. In addition to empirical work modeling social media and mobile phone sensor data, he works on how to understand statistics, machine learning, and data science from critical and constructivist perspectives, on ethical and policy implications of predictive modeling, and on understanding and communicating foundational problems in statistical models of social networks. He has an undergraduate degree in history of science from Harvard, a master’s from the Oxford Internet Institute, and a PhD from Carnegie Mellon University’s School of Computer Science.

Published by jraffaghelli

Research professor at the University of Padua, Department of Philosophy, Sociology, Pedagogy and Applied Psychology. Former Ramón y Cajal researcher at the Faculty of Education and Psychology (Universitat Oberta de Catalunya). PhD in Education and Cognitive Science and Master in Adult Education (Ca' Foscari University of Venice); Degree in Psychology (University of Buenos Aires).
