Title VII

Posted on October 5, 2017 by Sarah Shugars

Yesterday, Attorney General Jeff Sessions sent a memo to agency heads and US attorneys. Obtained by BuzzFeed, the memo read in part:

Title VII’s prohibition on sex discrimination encompasses discrimination between men and women but does not encompass discrimination based on gender identity per se, including transgender status.

In other words, while federal law previously recognized that Title VII of the 1964 Civil Rights Act applied to all forms of gender discrimination, the Justice Department is now choosing to interpret the law more narrowly, no longer protecting all men and women from employer, voting, or other forms of public discrimination. Specifically, transgender men and women will no longer have these federal protections.

Currently, only 20 states plus the District of Columbia have laws prohibiting discrimination against transgender individuals. So with the distribution of a memo, three-fifths of our nation’s citizens just lost human rights protections which they had the day before.

It shouldn’t be that easy to take away basic rights.

Additionally, the removal of federal backing puts existing state laws into greater peril, as opponents are already mobilized to overturn them. Even more of our citizens could lose their rights.

This unconscionable act cannot go unchallenged.

In his memo, Sessions tries to sound nonchalant about stripping citizens of their rights. The new Justice Department interpretation is “a conclusion of law, not policy. As a law enforcement agency, the Department of Justice must interpret Title VII as written by Congress.”

Scapegoating Congress for this egregious interpretation, however, flies in the face of existing case law and the existing understanding of Title VII protections.

Sessions attempts to appear all innocent and neutral in re-interpreting this law:

The Justice Department must and will continue to affirm the dignity of all people, including transgender individuals. Nothing in this memorandum should be construed to condone mistreatment on the basis of gender identity, or to express a policy view on whether Congress should amend Title VII to provide different or additional protections.

But it’s not neutral to roll back protections which have been in place, and it’s not innocent to explicitly remove protections covering transgender individuals. Despite his claims to the contrary, this is a policy move, not a legal one.

Furthermore, this move comes just days after the United States took a position – as just one of 13 countries – against a UN resolution condemning the death penalty as a sanction for same-sex relations.

Yes, you read that right – we are no longer against the state-sanctioned murder of people for being gay.

The current administration is waging a war against human rights on many fronts. I know we are tired, we are exhausted and numbed from the constant stream of negative news. But we cannot allow these policy changes to pass silently or without confrontation.

We cannot let it be okay to simply re-interpret someone’s rights away.

Automated Methods for Identifying Civilians Killed by Police

Posted on October 4, 2017 by Sarah Shugars

I recently read Keith et al’s excellent paper, Identifying civilians killed by police with distantly supervised entity-event extraction, which was presented this year at the Conference on Empirical Methods in Natural Language Processing, or, as it’s more commonly known, EMNLP.

The authors present an initial framework for tackling an important real world question: how can you automatically extract from a news corpus the names of civilians killed by police officers? Their study focuses on the U.S. context, where there are no complete federal records of such killings.

Filling this gap, human rights organizations and journalists have attempted to compile such a list through the arduous – and emotionally draining – task of reading millions of news articles to identify victim names and event details.

Given the salience of this problem, Keith et al set out to develop a more streamlined solution.

The event-extraction problem is furthermore an interesting NLP challenge in itself – there are non-trivial disambiguation problems as well as semantic variability around indicators of civilians killed by police. Common false positives in their automated approaches, for example, include officers killed in the line of duty and non-fatal encounters.

Their approach relies on distant supervision – using existing databases of civilians killed as mention-level labels. They implement this labeling with both “hard” and “soft” assumption models. The hard labeling assumes that every mention of a person (name and location) from the gold-standard database corresponds to a mention of a police killing. This assumption proves to be too hard and an inaccurate model of the textual input.

The “soft” models perform better. Rather than assume that every relevant sentence corresponds to a mention of a police killing, soft models assume that at least one of the sentences does. That is, if you take all the sentences in the corpus which mention an individual known to have been killed by police, at least one of those sentences directly conveys information about the killing.

Intuitively, this makes sense – while the hard assumption takes every mention of Alton Sterling, Michael Brown, or Philando Castile to occur in a sentence mentioning a police killing, we know from simply reading the news that some of those sentences will talk about their lives, their families, or the larger context in which their killing took place.
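The difference between the two assumptions can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names and the `(name, sentence)` input format are hypothetical, and the soft assumption is rendered here as a noisy-or over per-sentence probabilities, which is one common way to encode "at least one of these is true."

```python
from math import prod


def hard_labels(mentions, known_victims):
    """Hard assumption: every sentence naming a known victim is labeled positive.

    `mentions` is a list of (name, sentence) pairs (a hypothetical format).
    """
    return [(sentence, 1 if name in known_victims else 0)
            for name, sentence in mentions]


def soft_entity_probability(mention_probs):
    """Soft assumption, sketched as a noisy-or.

    Given a model's per-sentence probabilities that each sentence describes
    a police killing, return the probability that at least one of them does.
    """
    return 1.0 - prod(1.0 - p for p in mention_probs)


# Placeholder data: three sentences mentioning the same (hypothetical) person.
mentions = [
    ("Person A", "Person A was shot by officers on Tuesday."),
    ("Person A", "Person A grew up in the neighborhood."),
    ("Person A", "Family members remembered Person A at a vigil."),
]
labels = hard_labels(mentions, {"Person A"})        # all three labeled 1
entity_prob = soft_entity_probability([0.9, 0.1, 0.05])
```

Under the hard assumption all three sentences become positive training examples, even though only the first describes the killing. Under the soft version, one confident sentence is enough to make the entity-level probability high while the other sentences are free to score low.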

For both assumptions, Keith et al compare performance between a convolutional neural net and a logistic regression model – ultimately finding that the regression, with the soft assumption, outperforms the neural net.

There’s plenty of room for improvement and future work on their model, but overall, this paper presents a clever NLP application to a critical, real world problem. It’s a great example of the broad and important impact NLP approaches can have.

White Space

Posted on October 3, 2017 by Sarah Shugars

This weekend, I had the opportunity to attend a rich discussion hosted by The Welcome Project with local author Jennifer De Leon. The conversation focused on De Leon’s 2013 short story The White Space.

While helping her father put together his first résumé, the U.S.-born De Leon writes:

Without cell phone or fax numbers, email or website addresses, the top of the page looks lonely. Where do I write that my father grew up along the southern coast of Guatemala, where his father worked for the U.S.-owned United Fruit Company (UFC), which helped kick Communism to the world curb while pretending to care about Guatemalan citizens’ intake of bananas? They were only interested in profits and maintaining a capitalist economy.

…On my own résumés over the last ten years, phrases like terminal degree, academic honors, and double major are arranged neatly under the canopy of this section. But I can’t use any of these terms here. My father was denied the opportunity to complete secondary school in Guatemala because he needed to help support his brothers and sisters. Instead he plucked feathers off dead chickens in a small factory in Guatemala City from the time he was 14 years old.

…So tonight, as I help my father write his first résumé, I struggle to find words to fill this white space.

There is much in De Leon’s story which would resonate with any adult child: that feeling that you don’t really know your parents the way you might know a friend; that there is something intangibly distant about their experiences; that they lived in and were shaped by a world which ceased to exist before you were born; that the rich texture of their experience will always be beyond your grasp.

There is much in her story which would resonate with any first-generation college student: feeling that vast void which palpably disconnects generational experience; realizing the values and norms you so blithely take for granted can seem foreign and obscure; coming to the inescapable conclusion that those same norms glibly dismiss the experiences of people whom you know to have real value.

And, as De Leon and others discussed this weekend, there is much in her story which resonates broadly with children of immigrants: feeling the generational and cultural divide even more sharply; feeling ashamed at your lack of fluency in your parents’ language; feeling like you’re torn between selves, between worlds, between identities.

Feeling like nothing you can do will ever make up for the sacrifice your parents made on your behalf.

In reflecting on all these interwoven, sweet and painful complications, De Leon concluded:

“Like most beautiful things in life, it’s not so simple. I just do my best.”

Crime and Hate

Posted on September 29, 2017 by Sarah Shugars

Ally Lee Steinfeld had been missing since early September. Her body was found recently, mutilated and burned. She was 17.

Her death made Steinfeld at least the 21st transgender person killed in the United States this year. A record high of 22 murders was recorded by the Human Rights Campaign last year.

We have to do better.

Steinfeld’s case is not being pursued as a hate crime. The sheriff overseeing the case told the Associated Press: “You don’t kill someone if you don’t have hate in your heart. But no, it’s not a hate crime.” That talking point was echoed by the prosecutor in the case, who told Time: “I would say murder in the first-degree is all that matters. That is a hate crime in itself.”

Perhaps this is accurate in a practical sense – in Missouri, where the crime took place, first-degree murder is punishable by execution or life imprisonment. A hate crime charge would be unlikely to add penalty.

Such comments, however, miss the point. A woman is dead. We have to do better.

Some advocates have even started to question whether hate crimes prosecution is an effective strategy. As one ACLU lawyer put it, “I worry that what hate crime laws do is narrow our focus on certain types of individual violence while absolving the entire system that generates the violence.”

And that’s the thing – it is a problem with the entire system. We are all culpable in perpetuating the gross transphobia of our society – through violent transphobic acts, through subtle jokes and misgendering, or by being complicit through silence while such hateful acts take place.

We have to do better.

Personally, I’m not prepared to abandon hate crime legislation just yet – whether adding to a punishment or not, ignoring the hate of a crime seems to implicitly indicate that while the crime may be punishable, the hate itself is sanctioned. But I’ve met a lot of good, smart lawyers who tell me that sometimes you have to sacrifice framing in the legal system – you go for the toughest penalty you can go for.

I do not know whether we can best accomplish our work through hate crime legislation or through other modes of advocacy. I only know that we have to do better.

We tell young women that they can be anything, that they can do anything. That they should shut down the haters and embrace their true selves. We tell women that it is their right in the 21st century to be the person they want to be. We tell them this is America. We tell them they are free.

Three months before she died, Steinfeld posted to Instagram: “I am proud to be me I am proud to be trans I am beautiful I don’t care what people think.”

We have to do better.

A Living Language

Posted on September 28, 2017 by Sarah Shugars

Languages which are still being spoken are generally referred to as living languages. The metaphor is apt – a language is “living” not only insofar as its speakers are biologically living, but in that the language itself grows and changes throughout time. In a genuinely meaningful sense of the word, the language is alive.

This is a beautiful metaphor, but problematic for text analysis. It is, after all, difficult to model something which is changing while you observe it.

Language drift can be particularly problematic for digital humanities projects with corpora spanning a century or more. As Ben Schmidt has pointed out, topic models trained on such corpora produce topics which are not stable over time – e.g. a single topic represents different or drifting concepts during different windows of time.

But the changes of a language are not restricted to such vast time scales. On social media and other online platforms, words and meanings come and go, sometimes quite rapidly. Indeed, there’s no a priori reason to think such rapid change isn’t a feature of all everyday language – it is simply better documented through digital records.

This raises interesting questions and problems for scholars doing text analysis – at what time scales do you need to worry about language change? What does language change indicate for an individual or for a society?

One particularly interesting paper which tackles some of these questions is Danescu-Niculescu-Mizil et al’s No country for old members: User lifecycle and linguistic change in online communities.

Studying users of two online beer discussion forums, they find, remarkably, that users have a consistent life cycle – new users adopt the language of the community, getting closer and closer to linguistic norms. At a certain point, however, their similarity peaks – users cease changing with the community and move further and further away linguistically as a result.

The language of the community continues changing, but the language of these “older” users does not.

This finding is reminiscent of earlier studies on organizational learning, such as those by James March – in which employees learn from an organization while the organization simultaneously learns from the employees. In his simulations, organizations in which people learn too quickly fail to converge on optimal information. Organizations in which people learn more slowly – or in which employees come and go – ultimately converge on better solutions.

Both these findings reflect the sociolinguistic theory of adult language stability – the idea that your learning, and specifically your language, stays steady after a certain age. The findings from Danescu-Niculescu-Mizil et al, however, suggest something more interesting: your language becomes stable over time in a given community. It’s not clear that your overall language will stabilize; rather, you learn the norms of a given community. Since these communities may change over time, your overall language may still be quite dynamic.

The Yellow Day

Posted on September 27, 2017 by Sarah Shugars

I made the mistake of going outside today, so now all I can think about is how incredibly hot it is. For people who bask in warm weather, I suppose, it is not too miserable – but, for me, upper 80s at the end of September is more than I would hope for.

Mid-60s would do just fine.

If you’re wondering, the average high for Boston in September is a reasonable 73 degrees Fahrenheit. The record high, however, is a discomforting 102, achieved in 1881.

I was curious to learn more about that heat wave – hoping, perhaps, for some eloquently antiquated newspaper articles on the subject.

Instead, I found something much more interesting. The record 102 temperature was reached on September 7, 1881 – the day after the “Yellow Day,” when a “saffron curtain” mysteriously blanketed New England states.

It was eventually traced back to the great Thumb Fire of Michigan, one of the most devastating fires in that state’s history, burning over a million acres, but at the time, no one had any idea what was going on.

As the Boston Globe described:

Yesterday Boston was shrouded, and nature’s gloom soon infusing itself into the hearts of all made it a day long to be remembered, reminding one vividly of the famous dark day of years ago. About 7 O’Clock in the morning the golden pall shrouded the city in its embrace, and the weird unreal appearance continued throughout the day. As one approached a doorway from within and glanced out upon the sidewalk and street, it was difficult to dispel the illusion that an extensive conflagration was raging near, and that it was the yellow, gleaming light from the burning houses that produced the singular effect. Stepping to the sidewalk and glancing upward the roofs of the houses cut sharp and clear against the depths beyond.

A historian further described the eerie chromatic effects of the smoke:

The air became still, and calm, during that Tuesday, and people remarked about the odd tinge that colors took on as the day wore on. Plants were particularly brilliant – the odd light sharpening their green and blue hues. Lawns, usually a mundane green, took on brilliant color, and looked oddly bluish, in the day’s strange light. Yellow objects appeared colorless and white, and the color in red objects popped, while blue objects became ghostly. People in the street looked sickly and yellowish. Overhead, birds flew low in the skies.

The event was particularly startling because professed Prophetess Mother Shipton had reportedly predicted some two centuries before:

The world to an end shall come,
In eighteen hundred and eighty one.

As far as I can tell, however, the world did not actually come to an end that day.

Words and Topics

Posted on September 26, 2017 by Sarah Shugars

Reading articles skeptical of the veracity of topic model outputs has reminded me of this passage from Wittgenstein’s Philosophical Investigations:

Our language can be seen as an ancient city: a maze of little streets and squares, of old and new houses, and of houses with additions from various periods; and this surrounded by a multitude of new boroughs with straight regular streets and uniform houses.

In short: words are complicated. Their meaning and use shifts over time, building a complex infrastructure which can be difficult to interpret. Indeed, humanists can spend a whole career examining and arguing over the implications of words.

In theory, topic models can provide a solution to this complication: if a “topic” accurately represents a “concept,” then it dramatically reduces the dimensionality of a set of documents, eliciting the core concepts while moving beyond the complication of words.

Of course, topics are also complicated. As Ben Schmidt argues in Words Alone: Dismantling Topic Models in the Humanities, topics are even more complicated – words, at least, are complicated in an understood and accessible way. Topic models, on the other hand, are abstract and potentially inaccessible to people without the requisite technical knowledge.

To really understand a topic returned by a topic model, it is not enough to look at the top N words – a common practice for evaluating and presenting topics – you need to look at the full distribution.
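One rough way to see what the top N words leave out is to measure how much of a topic's probability mass they actually cover. The sketch below assumes a topic is given as a plain dictionary mapping words to probabilities; the function name and the toy numbers are hypothetical and are not taken from Schmidt's article or from any particular topic modeling library.

```python
def top_n_mass(topic, n=10):
    """Fraction of a topic's probability mass covered by its n most probable words.

    `topic` maps each word to its probability (or weight) under the topic.
    """
    total = sum(topic.values())
    top = sorted(topic.values(), reverse=True)[:n]
    return sum(top) / total


# Toy topic: three prominent words plus a long tail of 100 rare words.
toy_topic = {"ship": 0.20, "sea": 0.10, "wind": 0.05}
toy_topic.update({f"tail_word_{i}": 0.0065 for i in range(100)})

coverage = top_n_mass(toy_topic, n=3)
```

In this toy topic the three headline words account for only about a third of the mass; the rest sits in a tail that a top-N list never shows.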

But what does it even look like to examine the distribution of words returned by a topic model? The question itself belies understanding.

While “words” are generally complicated, Schmidt finds a clever opportunity to examine a distribution of “words” using ships’ logs. Each text contains the voyage of a single ship and each “word” is given as a single longitude and latitude. The “words” returned by the topic model can then be plotted precisely in 2D space.

With these visualizations of topic distributions, Schmidt raises important questions about the assumptions of coherence and stability which topic models make.

He doesn’t advocate entirely against topic models, but he does warn humanists to be wary. And, importantly, he puts forth a call for new methods to bring the words back to topic models – to find ways to visualize and understand entire distributions of words rather than simply truncating topics to lists of top words.

Travel Ban: Take 3

Posted on September 25, 2017 by Sarah Shugars

Yesterday, President Trump issued his third travel ban. As you may recall, the previous Executive Order on this topic called for the “assessment of current screening and vetting procedures.” While the ban itself was suspended by numerous legal challenges, apparently the information gathering work was in fact completed.

The new travel ban affects nationals of 8 countries – Chad, Iran, Libya, Syria, Venezuela, Yemen, Somalia, and North Korea. Sudan was removed from the previous travel ban list, while Venezuela and North Korea were added. Six of the affected countries have majority Muslim populations.

The new ban will remain in effect indefinitely.

Experts indicate that the new ban will be harder to challenge in court. It is more polished, more precise, and more removed from President Trump’s numerous anti-Muslim campaign comments. It ameliorates some of the most egregious problems with the initial, January 27 ban: there will be a several-week delay before the new ban goes into effect, people who currently hold valid visas will not be affected by the new ban, and restrictions vary slightly by country, allowing, for example, Iranians with valid student visas to enter the country.

In short, this is what a politically savvy travel ban would have looked like in the first place. It has been thoroughly considered and vetted; carefully dressed up to give the impression of a relatively reasonable piece of U.S. policy.

But make no mistake: this travel ban still represents a grave overreach based in fear and racism. It is still unacceptable.

I have attended several travel ban protests in the last nine months and it looks as though in the near future I’ll be attending more.

And while attending those protests, I suppose I’ll be remembering Machiavelli’s advice to his beloved prince: If you’re going to do something terrible, start by doing something as terrible as possible. Then, when you benevolently scale back to something slightly less terrible, the people will appreciate your reasonableness and moderation.

That’s what a clever dictator would do.

Gender and Language

Posted on September 21, 2017 by Sarah Shugars

Both gender and language are social constructs, and sociological research indicates a link between the two.

In Lakoff’s classic 1973 paper, Language and woman’s place, she argues that “the marginality and powerlessness of women is reflected in both the ways women are expected to speak, and the ways in which women are spoken of.” This socialization process achieves its end in two ways: teaching women the ‘proper’ way to speak while simultaneously marginalizing the voices of women who refuse to follow the linguistic norms dictated by society. As Lakoff writes:

So a girl is damned if she does, damned if she doesn’t. If she refuses to talk like a lady, she is ridiculed and subjected to criticism as unfeminine; if she does learn, she is ridiculed as unable to think clearly, unable to take part in a serious discussion: in some sense, as less than fully human. These two choices which a woman has – to be less than a woman or less than person – are highly painful.

Lakoff finds numerous lexical and syntactic differences between the speech of men and women. Women tend to use softer, more ‘polite’ language and are more likely to hedge or otherwise express uncertainty within their comments. While she acknowledges that – as of the early 70s – these distinctions have begun to blur, Lakoff also notes that the blurring comes almost entirely in the direction of “women speaking more like men.” That is, language is still gendered, but acceptable language has grown in breadth for women, while ‘male’ language remains narrow and continues to be taken as the norm.

A more recent study by Sarawgi et al looks more closely at algorithmic approaches to identifying gender. They present a comparative study using both blog posts and scientific papers, examining techniques which learn syntactic structure (using a context-free grammar), lexico-syntactic patterns (using n-grams), and morphological patterns (using character-level n-grams).

Sarawgi et al further argue that previous studies made the gender-identification task easier by neglecting to account for possible topic bias, and they therefore carefully curate a dataset of topic-balanced corpora. Additionally, their model allows for any gamma number of genders, but the authors reasonably restrict this initial analysis to the simpler binary classification task, selecting only authors who fit a woman/man gender dichotomy.

Lakoff’s work suggests that there will be lexical and syntactic differences by gender, but surprisingly, Sarawgi et al find that the character-level n-gram model outperformed the other approaches.

This, along with the fact that the finding holds in both formal and informal writing, seems to suggest that gender-socialized language may be more subtle and profound than previously thought. It is not just about word choice or sentence structure, it is more deeply about the very sounds and rhythm of speech.

The character n-gram approach used by Sarawgi is taken from an earlier paper by Peng et al which uses character n-grams for the more specific task of author attribution. They test their model on English, Greek, and Chinese corpora, achieving impressive accuracy on each. For the English corpus, they are able to correctly identify the author of text 98% of the time, using a 6-gram character model.

Peng et al make an interesting case for the value of character n-grams over word n-grams, writing:

The benefits of the character-level model in the context of author attribution are that it avoids the need for explicit word segmentation in the case of Asian languages, it captures important morphological properties of an author’s writing, it can still discover useful inter-word and inter-phrase features, and it greatly reduces the sparse data problems associated with large vocabulary models.

While I initially found it surprising that a character-level n-gram approach would perform best at the task of gender classification, the Peng et al paper seems to shed computational light on this question – though the area is still under-theorized. If character n-grams are able to so accurately identify the single author of a document, and that author has a gender, it seems reasonable that this approach would be able to infer the gender of an author.
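For readers unfamiliar with the feature itself, here is a minimal sketch of character n-gram extraction. It is not the model from either paper, only the counting step; the function name is hypothetical, and the choice to keep spaces and punctuation as ordinary characters is an assumption that matches the "inter-word" features Peng et al describe.

```python
from collections import Counter


def char_ngrams(text, n=6):
    """Count overlapping character n-grams, treating spaces and punctuation as characters."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Because spaces count as characters, n-grams can straddle word boundaries.
profile = char_ngrams("to be or not to be", n=3)
```

Here `profile["to "]` and `profile[" be"]` are both 2: the sliding window picks up fragments that span the gap between words, with no tokenization step at all. A classifier would then compare such count profiles across authors or groups of authors.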

Still, the effectiveness of character n-grams in identifying an author’s gender indicates an interesting depth to the gendered patterns of language. Even as acceptable language for women converges to the acceptable language of men, the subtleties of style and usage remain almost subconsciously gendered – even in formal writing.

TERFs

Posted on September 20, 2017 by Sarah Shugars

Last week, an altercation related to a “What is Gender?” event occurred in Speaker’s Corner – “a traditional site for public speeches and debates” in London.

The event was organized by a group of self-identified “gender-critical feminists” – essentially, women who don’t believe that all women deserve equal rights.

As you might imagine, in the face of such an event a group of protestors showed up to demonstrate in favor of the opposite: all women deserve to be treated with dignity and respect.

From there, details begin to get fuzzy, but it appears that a woman from the first group – the “gender-critical feminists” – began harassing and attacking a woman from the second group – those supporting equality. The attacker was eventually pulled off the victim, getting clocked in the face in the process.

Afterwards, pictures of the attacker’s bruised face began to circulate online, along with a questionable story. The woman – who can be seen in a video to be shaking another woman like a rag doll until a third woman intervenes – claimed that she was the real victim; the other women attacked her.

Except, she didn’t say women.

“Gender-critical feminist” is a palatable label adopted by women more colloquially known as TERFs – Trans-Exclusionary Radical Feminists. They are fervently passionate self-identified feminists whose feminism does not have space for all women.

In short, the attacker, having incited violence with seeming intention, proceeded to misgender her victims and successfully paint herself in popular media as just a normal old woman who was wrongly attacked while attempting to mind her own business.

This narrative is exceedingly dangerous.

Taken by itself, the event is unfortunate. Indeed, any time a person is attacked in the street is cause for concern.

But the narrative that emerged from this incident plays dangerously into broader misconceptions and stereotypes. It reinforces the idea that some women are inherently dangerous and that other women would be wise to distance themselves; it tacitly assumes that only some women are ‘truly’ women in some mystically vague sense of the word, while other women are not; and it erases and attempts to overlay the experience of women for whom these first two statements ring so obviously false.

It is gaslighting on a social scale.

Consider the account described in a statement by Action For Trans Health London, one of the organizations leading the demonstration against the TERFs:

Throughout the action, individuals there to support the ‘What is Gender?’ event non-consensually filmed and photographed the activists opposing the event. Often photos and videos taken by transphobes are posted online with the intention of inciting violence and harassment against trans activists. Due to this clear and documented history of transphobes violently ‘outing’ individuals of trans experience, visibility can be a high risk to trans individual’s personal safety.

During the action, a transphobe approached activists whilst filming with their camera. An individual then attempted to block their face from the lens of camera, leading to a scuffle between both individuals. This altercation was quickly and efficiently broken up by activists from both sides.

Action for Trans Health London later shared personal accounts from women who were assaulted by TERFs during the events of that evening.

Activists had good reason to be concerned for their safety.

Yet the stories emerging from that night don’t talk about the women who were assaulted. They don’t talk about the valid fear these women experienced when someone got up in their face with a camera. They don’t talk about the pattern of violence and harassment these women face while just trying to lead their normal lives.

In fact, the stories do worse than ignore the incident altogether. They blare the headline that a woman was hit during the altercation while reserving the full sense of ‘woman’ for the perpetrator, implicitly directing compassion to the person who did the attacking.

If you’re not familiar with the term, gaslighting is “a form of manipulation that seeks to sow seeds of doubt in a targeted individual or members of a group, hoping to make targets question their own memory, perception, and sanity.”

If you have never experienced gaslighting, be glad. If you have experienced gaslighting, you know that it is one of the worst possible sensations. You lose the ability to trust yourself, to trust your own instincts and senses. You lose the ability to know what is real due to the unwavering insistence of those around you that your reality is false.

And make no mistake, the dominant narrative emerging from the incident at Speaker’s Corner is a sophisticated form of gaslighting.

It is gaslighting when an attacker is allowed to mischaracterize their victims, it is gaslighting when the injuries suffered by an attacker are treated as more concerning than the injuries they inflicted, and it is gaslighting to pretend that people who have been systematically and zealously victimized are somehow the real perpetrators deserving of our scorn.

The sad truth is that there is an epidemic of violence against trans people. In the United States alone, at least 20 transgender people have been violently killed so far in 2017. Seven were murdered within the first six weeks of the year. Almost all were transgender women of color.

We cannot pretend that this violence isn’t occurring, and we cannot stay silent in the face of false narratives which wrongfully defame and mischaracterize an entire population of women.

I don’t know how to say it more plainly than that. To deny the rights of all women, to deny the existence of all women, and to deny the richly varied experiences of all women is simply unconscionable. You cannot do those things and call yourself a feminist.

I am not much of anyone, and it is always daunting to wonder what one small person can do in the face of terrible, complex, and systemic problems. I endeavor to do more, but literally the least I can do is to say this:

To all my transgender sisters: I see you. I believe you. And I will never, ever, stop fighting for you. I will not be silent.

Civic Studies

An intellectual community of researchers and practitioners dedicated to building the emerging field of civic studies

Author Archives: Sarah Shugars
