Disambiguation, In-Jokes, and Name Collisions: What You Need to Know When Naming a Python Project

Content notes: codes of conduct, colonization, natural disasters, ageism, racism, interpersonal violence, actual snakes

I gave this talk at North Bay Python. I’ve included the text of the talk below, as well as the video.

Names matter.

Names set expectations: a conference with a location name in its title  is probably in that location and a Python module with the word “test” in it will do something to do with testing. If North Bay Python took place south of the Bay, we’d all have some questions for the conference organizers.

Names create first impressions. Just reading the name of a project can immediately tell someone whether they’re going to use or contribute to that project. I’ve seen plenty of offensive library names and I’ve decided not to use those libraries just because of their names. I don’t need to look for a code of conduct or at the quality of the code of a project with an offensive name. I just move on. If you really want examples of those names, talk to me privately because I frankly don’t want to say them in public. 

There’s a business case against using software modules with problematic names, if you work in the sort of organization that requires a business case to do the right thing: any core maintainer unable to respond to feedback about a problematic name will be unable to  consider security risks that don’t impact them directly.

Names endure. Think about where we are right now. Petaluma. Sonoma County. California. The United States of America. The US got its name during the American Revolution. It was purposely structured to make a bunch of colonies sound like they were all working together and knew what they were doing. California is named for a magical island in a 16th-century Spanish romance novel. So, yeah, we use a name from a bunch of Spanish conquistadors thinking fondly of the love stories they read growing up.

Sonoma and Petaluma both come from Coastal Miwok words made to sound more acceptable to the Europeans who colonized this area. Sonoma’s root word means “Valley of the Moon”. Péta Lúuma was the name of the Miwok town located roughly here prior to the area being controlled by the Spanish, the English, the Spanish again, the Russians, the Mexican Empire, the Mexican Republic, the California Republic, and the United States.

Most dialects of Miwok are now extinct, due to the US’s genocidal policies with some European help. Before 1579, Coast Miwok tribal land included all of Marin County plus southern Sonoma County. By 1817, the only land still directly controlled by the Miwok was the Pacific Coast of the Marin Peninsula, from Point Reyes north to Bodega Bay. In 1920, Miwok and southern Pomo tribes tried to move to the 15 acre Graton Rancheria, held in trust by the federal government, but only 3 of the acres were actually habitable. The US government dissolved that trust in 1958 and ended federal recognition of the tribe until the Graton Rancheria Restoration Act was passed in 2000. 

The name “Petaluma” endures. 

Humans relate to names in deeply emotional and often unnoticed ways. Let’s go to natural disasters for a potentially less depressing example: researchers at the University of Michigan looked into whether people are likely to donate to hurricane aid causes if they connect with the hurricane’s name. Folks who shared a first initial with the hurricane’s name were significantly more likely to donate. That means that Kelly’s, Kai’s, and Khadija’s were more likely to donate to relief efforts for Hurricane Katrina than Hurricane Mitch. 

Correlation isn’t causation, but I will guarantee you that there is someone out there thinking about how much money they could raise by choosing a hurricane naming schema with a distribution of first initials more closely matching the distribution of first initials of all our names. 

Names don’t exist in a vacuum. Making sure we understand the context and nuances of potential names is a necessary first step.

Python Naming Schema

Question for the audience: Who here knows what inspired the name of the Python programming language?

Monty Python — that’s right!

Monty Python is a British sketch troupe who have turned 45 episodes made between 1969 and 1973 into a bunch of movies, books, games, a musical, and a few other things. 

I’m making sure to describe exactly what Monty Python is, because we can’t assume that everyone using the Python programming language also is culturally fluent in Monty Python. I have heard: “Monty Python? I think my grandparents used to watch that.”

That’s because Monty Python turns 50 next year. We might as well be asking programmers to get references to The Beverly Hillbillies, Petticoat Junction, or Green Acres. That in itself isn’t necessarily a problem, but some jokes age better than others. Monty Python isn’t aging particularly well. 

By expecting Python programmers to be familiar with Monty Python’s body of work, we’re sending them to look up sketches with titles that include the N word. I don’t think any of us want to accidentally endorse something like that. 

I’m not saying you can’t ever make a Monty Python reference again. I am saying we all have to consider the context of our references before slapping them on projects we want to share with the whole world. I’d ask you to think about it the same way you might think about putting a picture of an actual python in the middle of killing its prey up on your project’s homepage. No matter how cool all of you are with snakes, I am more interested in snake jokes and references than I am in snakes themselves.

Within the Python programming community, we’ve established some standard naming schemas, both formally and informally. 

We have PEP 8, the style guide for Python code, which includes naming conventions for variables, method names, and such. It doesn’t include suggestions for how to name projects, though there is a little piece of advice I’d love to have engraved on something shiny: avoid names that include the lowercase letter “l”, uppercase letter “O”, or uppercase letter “I”. I think that these suggestions should be followed for more than just single character variable names. They just make names easier to read and retype.

We also have less formal systems, especially for project names. For instance, many Python conferences and user groups include the city they’re located in their organizational names. Geography is a relatively easy way to disambiguate different local communities from larger overarching communities. We can tell that PyLadies Atlanta and PyLadies Santo Domingo are two separate groups. It gets even easier because there are unique identifiers for different locations in the form of IATA codes. IATA makes sure airport codes are unique, since there are very important differences between Portland, Oregon and Portland, Maine, if only because my cats are in Portland, Oregon and they expect me to fly to the correct city today in order to feed them dinner. IATA’s codes are so convenient that sometimes cities use them as nicknames or identifiers . Not all cities’ IATA codes are exactly intuitive. Chicago airports, you better believe I’m looking at you right now.

And while I will always struggle to remember that tickets to both ORD and MDW will get me to Chicago, the sort of “Py*” prefix used in PyLadies has become an easy-to-remember signal that a module is meant for use with Python. There are a variety ways “Py” has slid into Python project names, beyond just the prefix.

Some projects have chosen words whose letters already include “Py”, like Pyramid. “Pi”, spelled with an I instead of a Y, is pronounced the same way (at least in English), so there are also names that use nonstandard spelling like Project Jupyter. And then there’s the word “Pie”, which sounds like “Py” and is generally delicious. CherryPy, for instance, follows this naming schema.

A lot of projects have developed their own internal naming schema as well and for good reason. When related projects are branded together, users are more likely to realize that they can use all the different pieces you create. BeeWare, for instance, uses a combination of puns — so many puns — bees, and history, to create impressively descriptive names. First off, BeeWare has a subtitle, a full name, if you will. It’s BeeWare: the IDEs of Python. As in, what Julius Caesar was told in Shakespeare’s play of the same name, just before he was assassinated by a bunch of dudes in togas, except they’re referencing interactive development environments, rather than the middle of March.. 

As it happens, Toga is the name of a BeeWare library which provides wrappers for GUI APIs. Because of course, it is. Just like, of course, the name of BeeWare’s prevalidater is Beefore, with two Es. 

If it sounds like I’m saying a lot of names, that’s because we name a lot of things. We name them pretty fast, too — fast enough that there are multiples sets of projects sharing a name. I’m willing to bet that most of us in this room have heard of Django, the Python-based web framework. But there’s also a piece of tablature software for musicians, also named Django. 

Name collisions are annoying when you’re debugging a program, but they are a full on pain in the rear when you’re trying to search for help online. How many times have you walked away from a meetup or a conference with a Python module in mind to look up once you get home? You really only have three hopes for success for remembering the name:: the project has a memorable name, your new contact remembers to forward you that link, or you manage to find it through a search engine. You might wind up adding a bunch of keywords to your search terms, adding and removing words until you fight the right incantation to summon whichever library you’re looking for.

And if you’re looking for Python modules to help you with your herpetology research, I don’t even know what to tell you, except that I guess some herpetologists might have made the situation much more complicated by naming a species of python after Monty Python. 

Name collisions are our future, by the way. Unique identifiers are hard.

Name Selection

As we’re choosing names for Python projects, we have to keep all of these things in our head. Because that seems like an easy way to manage such a big decision, right? Nah, I’m just joking, that sounds super hard.

I actually just went over all of that stuff with you so that I could go over this checklist:

  • Check for how this name may be used elsewhere
  • Talk to actual humans about the name and what it means to them
  • Maaaaaaaaybe talk to a lawyer
  • Listen to the actual humans, no matter whether they respond positively or negatively
  • Make sure you can get any necessary accounts
  • Write down the meaning and history of your project’s name before you forget about it

In short, we’re going to go through a due diligence process before finalizing a name for a new project. Some of these steps can happen in different orders, but I want you to check off every item on this list before buying another sweet domain name. 

And if you’re worried that these rules will eliminate all the good domain names, rest assured that isn’t an issue. Let me put it this way: despite following these steps, I still own way too many domain names.

Check for how this name may be used elsewhere

Checking how a particular name is used can be as simple as running some online searches. Start with Google or DuckDuckGo or whatever flavor of search engine you prefer. Go a couple pages deep into those results. Depending on the words you’re looking at, you may get a Wikipedia disambiguation page, which honestly does a lot of the work here for you. On a practical level, we want to make sure there isn’t a major module out there already under a name you want to use. Wikipedia tends to be more up-to-date about programming modules than a lot of other topics. 

Do the same on Twitter, and click around to all those tabs that show up on the search page. You can look at what tweets mentioning your keyword are the most popular, usernames which include the keywords and so own. I’ve found that this is one of the easier ways to check both newly evolving words and to surface name collisions in other languages.

Go to Urban Dictionary, too. Before you actually type things into search boxes, though, I should definitely point out that Urban Dictionary is routinely NSFW. Twitter is also routinely NSFW, but for entirely different reasons. In this use case, you’re looking for NSFW terms that relate to your name — things that you may be too mature to respond to, but that other users would giggle at or find offensive. As such, maybe don’t assign this job to an intern or first-time contributor.

Basically, we’re looking for any terrible ways a bored thirteen-year-old can twist your project name. We can’t preempt every comment, but we do not have to make it easy. 

Talk to actual humans about the name and what it means to them

Next, and this is probably the hardest part, I’m going to need you to talk to actual humans, starting with any stakeholders interested in whatever project you’re working on. Depending on what kind of project you’re naming, your stakeholders could include employers and their lawyers, contributors (including documentation writers, who will need to type that name over and over again until it loses all meaning), and all of your friends who will have to listen to you talk about this project incessantly. 

You’re going to also want to check in with your potential users about your name. Spend just as much time on talking to potential users about naming as you do on more visual user experiences. Check how they spell the name when they hear it for the first time. Check how they pronounce it if they read the name before hearing it. This might sound like a marketing exercise, but it’s really a question of usability and accessibility. If a user can’t figure out how to spell a product name, they’ll have to deal with a high level of friction before even deciding whether learning about a given tool is worth scrolling. 

Tempting as it may be to ignore asking anyone outside of tech, I want you to make a point of running of names past at least three people who don’t work in tech. If you’re headed home for a holiday visit, feel free to ask your family to talk to you about your name idea. You can never guess who will run into what information and interact with it. You certainly can’t control it — if you could, every member of my family would be banned from ever seeing the word “Bitcoin”.

Your family members count as one person in this case, by the way. You can probably assume that your family’s background is close enough to your own that you’ll want to make sure to get some other opinions. You want to talk to folks who come from backgrounds different than your own because they’ll help you predict responses to that name from more people. 

Personally, I like to ask a kid about any names I’m considering. At the very least I will immediately hear any possible poop jokes about my name selection, long before any adult will work up the courage to discuss possible scatological references.

A person with a little distance from the project can also help you catch names that are hilarious to you but rely on insider knowledge. Sometimes those sorts of in-jokes confuse or exclude people, making it even harder to get up to speed with a new library. If you can’t explain the joke in the time it takes us to walk to wherever we’re eating today, your sense of humor may be too subtle for the rest of us.

One more note about humor: think about who you are making fun of. Punching up is funny. I feel comfortable snarking about Monty Python because the members of that comedy troupe have made serious bank and have legions of fans ready to talk about the good points of their careers. Punching down is boring. If I want to watch someone make jokes at other people’s expenses or use humor to be a bully, I can just watch the news. 

Check in with folks who speak languages you don’t, along with speakers of different dialects. Python is used all over the world, in more than 150 countries, so you’re never going to be able to check against all languages, but do what you can. 

Maaaaaaaaybe talk to a lawyer

Occasionally, a lawyer may be useful in this process. I am not a lawyer and therefore all I can say that there are very few situations in which I would personally feel the need to consult a lawyer before finalizing the name for a new project. It’s pretty much just edgecases, like wanting to use a name already claimed by a multinational corporation. Future Thursday may hate me for this, but any smaller legal issues are her problem. Do your own risk assessment, though.

Listen to the actual humans, no matter whether they respond positively or negatively

Why is listening a separate step? Because we don’t just want first impressions. We want to know what sort of connections endure. We want to know about those jokes that take people a minute to remember. We want to hear any “Oh crap do you know what that sounds like?” moments.

Make sure you can get any necessary accounts

On a purely practical level, you’re going to need to know if you can get domain names, GitHub repos, or any other accounts you might use. If you’re reusing a name that’s already been associated with one software project, you’re going to have to do a little extra running around before you can finalize your name.

Very few platforms have a system for aging out old accounts yet, so there’s no clear date when any software project is permanently inactive. Some open source projects go years between releases, with no real need to update a website or a Twitter account in between. You have to use your best judgement.

Check the old project’s materials for trademarks, copyrights, and other indicators of intellectual property. No matter how any of us feel about intellectual property laws, I think can assume none of us really want to get sued. It’s not impossible to unravel intellectual property from a defunct project. I have never found a name I consider worth that level of effort. 

Luckily, since we’re already talking to humans, we can just add the person last known to be running a given project to the list: if there’s a maintainer listed or even a contact form, just reach out and ask what’s going on with the project. If it’s actually defunct, ask if you can have their domain names and platform accounts. Make it as easy as possible on the other person and offer to pay for any fees, like domain transfer fees.

Even if there isn’t a pre-existing project, domain names can be tricky. Anyone working on anything related to any of the items on that Wikipedia disambiguation page could have already bought the domain name you want. Luckily, most Python projects are targeted towards audiences familiar enough with the internet to not flinch at the sight of non-dotcom domain names, like .io. That opens up options. Options for more research, I mean: That domain name is assigned to a part of the Indian Ocean currently referred to by the name British Indian Ocean Territory. Previously, that area was known as the Chagos Islands, mostly to the Chagossians, who the British forcibly removed from the Chagos Islands in 1966. Oh, look, another terrible example of how names reflect colonization! Using a .io address means taking this information into account.

Write down the meaning and history of your project’s name before you forget about it

Lastly, PLEASE PLEASE PLEASE document at least a little information about your project’s name. Just stick it in your README or some other piece of your documentation where it’s easy to find. Your documentation is a great place to put information about your project’s name and why you chose it. 

Write down the pronunciation of the name that you use, even if you think it’s obvious. If you really want to know why I started working on a Python-specific style guide, it’s because of PyQt. That’s four consonants in a row, just hanging out. I’m pretty fond of vowels, so I wasn’t sure at first how to pronounce PyQt. I googled it, but I did not have a lot of luck.

Address any potential issues right away. Make a note of what other uses of the project’s name that you’ll need to disambiguate from. Django documentarians might make a note about music, for instance, so that someone creating a Django plugin meant to be used by musicians will pick a name that will actually be searchable. Address any potential problems head on, as well. Including context about a location name, for instance, will let you make sure people know which Portland you live in. 

Go ahead and tell any jokes that go along with your project’s name right there in the documentation. Humor is amazingly memorable, but we only get that memory boost when we get the joke. It’s okay if not everyone gets the joke at first — jokes are still funny after they’re shared.

Takeaways

  • Names matter, even names that have existed for hundreds of years
  • Maybe don’t use Monty Python as a key part of your naming conventions
  • Complete due diligence on any name you consider