Dialogue

"I Very Clearly Remember an M and an F." –Moira Rose

Followers of my Instagram account @SchittsSheets know that all of my spreadsheets are titled with a line of dialogue from Schitt's Creek. Taking a cue from Roland and Jocelyn's mix-up with Moira seeing the sonogram of their baby, I named this spreadsheet "I Very Clearly Remember an M and an F."

Introduction

"Hi, I'm Moira Rose. And if you like fruit wine..." 

Sorry, my mistake; this is a different kind of introduction.

I was reading about gender and data bias, specifically a study conducted by the Geena Davis Institute (they have done some excellent work around film, television, and gender). I was struck by the following statistic about the roles of men and women in film and television:

"Men don't just have more roles, they also spend twice as much time on screen – this rises to nearly three times as much when, as most films do, the film has a male lead. Only when the lead is female do men and women appear about as often as each other (as opposed to women getting, as you might expect, the majority of screen time). Men also get more lines, speaking twice as much as women overall; three times as much in films with male leads; and almost twice as much in films with male and female co-leads. Again it is only in the few films with female leads where male and female characters drew even on screen time."

I was immediately intrigued, and I wondered how Schitt's Creek fared against this statistic. Given that I had the transcripts for each episode, it was "only" a matter of marking each line of dialogue with the character and their perceived gender and then gathering and analyzing the data. When I put out a request for help, many Schitt's Creek fans generously volunteered to mark each line of dialogue (over 21,000 lines!) with the character name, and I then pulled everything together in one spreadsheet workbook. Their help saved me about 30 hours of work, for which I was very thankful, considering that gathering and analyzing the dialogue still took me 51 hours to complete.

Assumptions

"Never assume, dear. It makes an ass out of both of us."

Hopefully I haven't played into Moira's fears here, but I had to make assumptions in most cases about characters' and writers' genders. Schitt's Creek does not have any identified trans or non-binary characters, so I used characters' pronouns and story information to determine their gender. As for the writers, some have their pronouns in their social media bios and some do not, so I had to make assumptions about some writers' gender as well.


Caveats

"We've done the best we can!"

If you read the original studies (cited at the end), you'll see they were based on sophisticated tracking of characters' time on screen and lines of dialogue. While I am committed to my spreadsheets and Schitt's Creek to a great degree, I am not committed enough to watch every episode of Schitt's Creek with a stopwatch, tracking the amount of time each character spends on screen. I have a life. Lol. In addition, I don't have the original scripts, so I don't have an accurate way to assess "lines" of dialogue. Not knowing what constitutes "a line" of dialogue, I defined "a line" as simply the uninterrupted dialogue of a character. As long as what they said wasn't interrupted by another character, it was a line. But because the number of words per line varies so much, counting lines wasn't an adequate measure of each character's dialogue.

Timing characters' speaking parts wasn't going to be accurate either, since some speak more slowly (hello, Johnny and Bob) and some faster, and I didn't have a way to equalize their rates of speech like they did in the study. Therefore, I decided to simply measure how many words each character said. Using "number of words spoken" as my means of measurement differs from the original studies, so I don't know whether the studies would have reached the same results if they had used the same measurement I did here. Technically, since my means of measurement differs from that of the studies I'm using for comparison, I'm comparing apples to oranges. But c'est la vie. "We've done the best we can!"
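For anyone who wants to try the same kind of count outside a spreadsheet, here is a minimal sketch of the "words spoken" measure in Python. It is not how I built the spreadsheets; the function name, the (character, text) layout, and the placeholder lines are all assumptions made up for illustration.

```python
from collections import Counter


def words_per_character(tagged_lines):
    """Add up the words each character speaks.

    Each item is (character, text), where text is one uninterrupted
    stretch of dialogue, i.e. one "line" as defined above.
    """
    totals = Counter()
    for character, text in tagged_lines:
        # Splitting on whitespace is a simple stand-in for "number of words".
        totals[character] += len(text.split())
    return totals


# Placeholder lines invented for illustration (not from the transcripts).
sample_lines = [
    ("Character A", "one two three four"),
    ("Character B", "one two"),
    ("Character A", "one"),
]

totals = words_per_character(sample_lines)
print(totals["Character A"], totals["Character B"])
```

Counting words this way sidesteps the question of what a "line" is: a long speech and a one-word reply both contribute exactly the words they contain.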

As far as the statistics for individual writers, I urge caution when it comes to making any assumptions about an individual's writing. One important aspect to keep in mind is that not every episode was written by one writer. Some writers wrote only one episode, some co-wrote one episode with another writer, some writers co-wrote a few episodes, etc. There is a lot of variability in how many episodes were written solo or co-written. Because some writers co-wrote episodes, their individual data includes co-written episodes and the influence of the other writer is an invisible, unaccounted-for factor. Additionally, it's my understanding that the writers' room for Schitt's Creek was pretty collaborative, so the effect of that on each episode is a big unknown factor. No one wrote in a vacuum. Truly, no writer's statistics should be compared to anyone else's due to these factors. 

Another important fact to keep in mind is that the sample set of writing is too small to draw any definitive conclusions about anyone's writing. Schitt's Creek had 80 episodes, but the most that any one writer wrote individually was 22 episodes, and the average number of episodes written by an individual writer was only six. Even for writers who wrote more than the average, it's still a small sample set, and conclusions should be held very loosely.

These aspects regarding the writers' statistics were what drove my decision not to include any writer's specific name with their statistics. The sample set was too small and too co-mingled with co-written episodes to come to valid conclusions about their individual writing, and the point of this project wasn't to uncover any specific individual's biases in writing. Nevertheless, I decided to include the data about individual writers, while obscuring their specific identity, because I think the statistics are interesting to consider just as data.

Results

"Well, it's a smiley face, so I'm assuming it's a positive result."

The resulting product is nine spreadsheets focused on individual characters' statistics, gender-specific statistics, and writer-specific statistics. Using the statistic quoted in the introduction (from Invisible Women; see Works Cited) as a guide, Schitt's Creek fares very well overall. Since it has male and female co-leads, we would expect the male characters in Schitt's Creek to speak almost twice as much as the female characters if the show matched that statistic (that is, close to 100% more than the female characters). Based purely on the number of words said in each episode, the actual result is that male characters speak 15.5% more than female characters. That is far closer to parity than the studies' results.
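To make the "percent more" arithmetic concrete, here is a small sketch. The function name is my own, and the 53.6% and 46.4% shares are assumed for illustration only (they are consistent with the roughly 46% female share discussed in this post, but they are not copied from the spreadsheets).

```python
def percent_more(male_words, female_words):
    """Return how much larger the male total is than the female total, in percent."""
    return (male_words - female_words) / female_words * 100


# "Twice as much" is the same thing as 100% more.
print(round(percent_more(200, 100), 1))

# Assumed shares of dialogue (illustrative, not the spreadsheet totals).
print(round(percent_more(53.6, 46.4), 1))
```

The point of the sketch is that a gap of only a few percentage points in share of dialogue already shows up as a double-digit "percent more" figure, because the comparison is relative to the smaller number.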

Gender-Based Results

Though the overall result of 15.5% more dialogue for male characters than female characters is better than the statistic from the studies, each season had more female characters than male characters, and yet female characters accounted for an average of only 46% of the dialogue. For example, in season 6, there was an average of 5.5 male characters and 6.4 female characters per episode. You would then expect the dialogue to be 46% for male characters and 54% for female characters (because 46% of the characters were male and 54% were female), yet the split runs the other way: male characters in season 6 spoke 53.5% of the dialogue and female characters 46.5%. While the imbalance isn't huge, every season had similar statistics, with male characters making up less than half of the characters yet speaking more than half of the dialogue.

Season 2 has the widest gap between the gender ratio of the characters and the split of the dialogue. In season 2, there was an average of 5.0 male characters and 6.9 female characters per episode. The resulting dialogue was 52.1% going to male characters and 47.9% going to female characters. If the dialogue matched the gender ratio, male characters should have had 42.1% of the dialogue and female characters 57.9%. Again, the split leans the opposite way from the gender ratio. Season 4 is the most closely balanced between dialogue and characters. There was an average of 5.2 male characters and 5.7 female characters per episode. Male characters had 50.9% of the dialogue and female characters had 49.1% – much closer to the expected 48% for male characters and 52% for female characters. That is a shortfall of about three percentage points for female characters, or roughly 6% less dialogue than their share of the characters would predict.
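The expected-versus-actual comparison for the three seasons above can be sketched like this. The function and variable names are my own, and the inputs are the rounded per-episode averages quoted in this section, so the output can differ by a tenth of a point or so from percentages calculated on the unrounded spreadsheet data.

```python
def expected_female_share(avg_male_chars, avg_female_chars):
    """Share of dialogue female characters would get if dialogue matched the character ratio."""
    return avg_female_chars / (avg_male_chars + avg_female_chars) * 100


# season: (avg male characters, avg female characters, actual female % of dialogue)
seasons = {
    2: (5.0, 6.9, 47.9),
    4: (5.2, 5.7, 49.1),
    6: (5.5, 6.4, 46.5),
}

gaps = {}
for season, (males, females, actual) in seasons.items():
    expected = expected_female_share(males, females)
    # Negative gap: female characters spoke less than their share of the cast.
    gaps[season] = actual - expected
    print(f"Season {season}: expected {expected:.1f}%, "
          f"actual {actual:.1f}%, gap {gaps[season]:+.1f} points")
```

With these inputs, season 2 shows the widest gap and season 4 the narrowest, which matches the comparison described above.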

Character-Based Results

The best analysis of character dialogue would be of the Roses' lines, since all four were in all 80 episodes. You could add in Roland and Stevie since they were in 74 and 78 episodes respectively, but since they weren't main characters, including their data would skew the results.

On average, the four Roses spoke 476 words each per episode. The elder Roses' personal averages were pretty close to that statistic: Johnny spoke an average of 472 words per episode (0.6% less than the average), and Moira had 491 (3.3% more than the average). Looking at the younger generation reveals a greater gap. David had 7.8% more words than the average (513 words), while Alexis had 10.4% fewer (426 words). It may come as a surprise that, of the four Roses, Alexis spoke the fewest words (34,098) and David the most (41,070). It's interesting that the elder Roses are much closer to parity than the younger Roses, and there may be more to explore here in terms of dialogue parity for older characters versus younger characters.
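Here is a short sketch of how each Rose's distance from the family average can be worked out. The variable names are my own, and the inputs are the rounded words-per-episode figures quoted above, so the percentages land within a few tenths of a point of the ones in the text rather than matching them exactly.

```python
from statistics import mean

# Rounded average words per episode, as quoted above.
words_per_episode = {"Johnny": 472, "Moira": 491, "David": 513, "Alexis": 426}

family_average = mean(words_per_episode.values())

# Percent above (+) or below (-) the family average for each Rose.
deviation = {
    name: (words - family_average) / family_average * 100
    for name, words in words_per_episode.items()
}

for name, pct in deviation.items():
    print(f"{name}: {pct:+.1f}% versus the family average")
```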

The Roses' statistics are easy to compare because each of them was in the same number of episodes and, arguably, all four are of equal importance. When considering minor characters, the comparisons get a little messier, since they're in differing numbers of episodes and offer different qualities to the show that would naturally lead to unequal time spent speaking. Nevertheless, it is interesting to look at the top seven minor characters' dialogue along gender lines. The top seven characters I classified as "minor" are Jocelyn (appeared in 66 episodes), Twyla (63), Ronnie (46), Ted (43), Bob (29), Mutt (23), and Ray (14). (I have classified Patrick as a major character. Even though he was in only 39 episodes, the importance of his character in terms of story, and therefore dialogue, gave him an average word count that was closer to the Roses' amounts than to the minor characters'.) Together, these seven minor characters average 124 words per episode. The breakdown along gender lines is dramatic, though. The male characters were in an average of 27 episodes and have an average word count of 143 per episode. In contrast, the female characters were in an average of 58 episodes (over double the male characters' average), yet they average only 99 words per episode, about 30% less than the male characters.
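The gender breakdown for the minor characters can be checked with a few lines of Python. This is a sketch with names of my own choosing; the episode counts come from the list above, and the 143 and 99 words-per-episode averages are taken as given from the text rather than recomputed.

```python
from statistics import mean

# (gender, episodes appeared in) for the seven minor characters listed above.
minor_characters = {
    "Jocelyn": ("F", 66),
    "Twyla": ("F", 63),
    "Ronnie": ("F", 46),
    "Ted": ("M", 43),
    "Bob": ("M", 29),
    "Mutt": ("M", 23),
    "Ray": ("M", 14),
}

# Average number of episodes per character, grouped by gender.
avg_episodes = {
    gender: mean(eps for g, eps in minor_characters.values() if g == gender)
    for gender in ("M", "F")
}

# Average words per episode by gender, as stated in the text.
male_avg_words, female_avg_words = 143, 99
word_gap = (male_avg_words - female_avg_words) / male_avg_words * 100

print(avg_episodes, round(word_gap, 1))
```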

It's probably fairer to compare characters who play similar roles in the show and story. Both Twyla and Ted play against Alexis's character, appearing often in scenes with her. Twyla appeared in 63 episodes, and Ted appeared in 43. Twyla's word count is 32% lower than the minor-character average (83 words per episode), while Ted's is 74% higher (216 words). Bob and Ronnie make another comparison, since their characters and storylines are of about equal importance. Ronnie was in 46 episodes and Bob was in 29. Ronnie had 47% fewer words than the average (64 words), and Bob had 4% fewer (118 words).

Writer-Based Results

Across the 80 episodes of Schitt's Creek, 13 people received writing credit – seven men and six women. Sixty-eight episodes were written by only men (66 by a sole writer, two co-written by two male writers), eight episodes were written by a sole female writer, and four were co-written by a male and a female writer. Of the 13 writers, only two male writers wrote episodes for all six seasons of the show; the other 11 came and went over the course of the series. Every season had at least one female writer credited, but season 4 was the only season where more than one female writer was credited with writing an episode.

If you look at the episodes written by male versus female writers, the statistics are very different. Episodes written solely by male writers gave male characters an average of 7% more words than female characters. Episodes written solely by female writers gave male characters an average of 30% more words than female characters. This is quite a dramatic difference, but it's impossible to make a valid conclusion about the comparison since 68 episodes were written by male writers and only eight episodes by female writers – it's a small sample size for the female writers.

Episodes that were co-written by a male and a female writer vary. Of the four co-written episodes, only one gave more words to female characters (8% more words). The other three gave 17%, 35%, and 37% more words to male characters than to female characters.

The individual writers varied widely in the percentage of dialogue they wrote for male versus female characters, even from season to season. None was consistent in the percentage of words written for one gender or the other. One writer wrote 37% more words for male characters in one season and then wrote 47% more words for female characters the next season, a swing of 84 points. (The smallest swing for any writer was 18 points.) In fact, for the writers who wrote for more than one season, wide variations from season to season were the norm.
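The 84-point figure comes from putting both seasons on one signed scale. Here is a tiny sketch of that idea; the sign convention (positive means more words for male characters, negative means more words for female characters) and the function name are my own assumptions for illustration.

```python
def swing(season_percentages):
    """Distance between a writer's most male-leaning and most female-leaning season."""
    return max(season_percentages) - min(season_percentages)


# Positive: % more words for male characters. Negative: % more for female characters.
one_writer = [37, -47]

print(swing(one_writer))
```

Because the two seasons sit on opposite sides of zero, the swing is the sum of the two percentages rather than their difference.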

When the writers are listed in order from the greatest percentage of words for male characters to the greatest percentage for female characters, the range runs from 86% more words for male characters to 46% more words for female characters, with an overall average of 20% more words written for male characters. Both extremes on this scale, 86% and 46%, belong to writers who wrote only one episode each. In general, the more episodes a writer wrote, the closer they moved to the middle of the scale.

Conclusion

"Although I made some excellent points, in the interest of a fair and balanced discussion, I will now argue the other side of the issue."

The statistic that sparked this whole project says that male characters speak almost twice as much as female characters when a film has male and female co-leads, so the end result of this analysis, with male characters in Schitt's Creek speaking only 15.5% more words than female characters, is a marked improvement. Nevertheless, it isn't equal, especially considering that each season had slightly more female characters than male. Female characters' share of the dialogue usually landed somewhere in the 40% range. I would be interested in how Schitt's Creek compares to other TV shows on air during this same time period, but I'll leave that analysis to someone else. I have Schitt's Creek reruns to watch. 🙂

The Raw Data

(the first spreadsheet I've ever posted in its entirety)
I Very Clearly Remember an M and an F

Works Cited

Criado-Perez, Caroline. INVISIBLE WOMEN: Data Bias in a World Designed for Men. Abrams Press, 2021.

“Geena Davis Inclusion Quotient.” Geena Davis Institute, 22 Jan. 2018, https://seejane.org/research-informs-empowers/data/.

“The Reel Truth: Women Are Not Seen or Heard.” Geena Davis Institute, 15 Aug. 2018, https://seejane.org/wp-content/uploads/gdiq-reel-truth-women-arent-seen-or-heard-automated-analysis.pdf.

Smith, Stacy L., et al. “Gender Roles & Occupations: A Look at Character Attributes and Job-Related Aspirations in Film and Television.” Gender Roles & Occupations, Geena Davis Institute on Gender in Media, https://seejane.org/wp-content/uploads/key-findings-gender-roles-2013.pdf.