Monday, July 16, 2018

The Naming of Places (Part 8): The Sea

Next up on place names are oceans (or seas).  Currently there can only be one ocean name on the map, and it's hard-coded to be “Chrysalis Sea":
That name was borrowed from one of my reference maps.

The real world isn't much help in naming oceans or seas.  There are only 5 oceans in the world, and only a hundred or so “seas," many of which are actually bays, gulfs and the like.  The thesaurus isn't any help, either.  There are only two synonyms in English for large bodies of water -- “ocean" and “sea."  Even the historical thesaurus doesn't offer any additional vocabulary.  So it looks like I'm on my own.

To start with, I think several kinds of descriptive words work for ocean names: weather qualities (“The Stormy Ocean"), water qualities (“The Glassy Ocean"), colors (“The Azure Ocean"), directional adjectives (“The Western Ocean"), and climate adjectives (“The Steamy Ocean" / “The Frigid Ocean").  On these fantasy maps, the ocean is a place of mystery -- it's the edge of the known world -- so a lot of the mystery adjectives I used for the Lost Coast work here as well (“The Bloodstained Ocean").  Marine animals and seabirds should work too (“Amberjack Ocean" / “Albatross Ocean").  Some miscellaneous adjectives work as well.

Note that the ocean is usually singular on the maps, or at least distinguished, so using “The" with the name seems to work well.  I can also use the “Ocean of ..." form.
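To make the three name forms concrete, here's a minimal sketch in Python.  It is not the actual Dragons Abound code: the word lists are tiny samples taken from the examples in this post, and the function name and the equal odds for each form are assumptions made for illustration.

```python
import random

# Tiny sample vocabularies drawn from the examples in this post.
# The real lists would be far longer; the grouping is an assumption.
ADJECTIVES = {
    "weather": ["Stormy"],
    "water": ["Glassy"],
    "color": ["Azure"],
    "mystery": ["Bloodstained"],
}
CREATURES = ["Amberjack", "Albatross"]
OF_NOUNS = ["Maelstroms", "Cannibals"]
BODIES = ["Ocean", "Sea"]


def ocean_name(rng):
    """Build one ocean name using one of three forms, chosen with equal odds."""
    body = rng.choice(BODIES)
    form = rng.choice(["adjective", "creature", "of"])
    if form == "adjective":
        category = rng.choice(sorted(ADJECTIVES))
        return f"The {rng.choice(ADJECTIVES[category])} {body}"
    if form == "creature":
        return f"{rng.choice(CREATURES)} {body}"
    return f"{body} of {rng.choice(OF_NOUNS)}"


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(5):
        print(ocean_name(rng))
```

Only the adjective form gets “The" here, which matches the examples above; whether the creature form should take it too is a matter of taste.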
I went through all my reference maps and collected the ocean names to see how well they matched what I had done so far (and to gather new ideas as well):
The Summer Sea
Eastern Sea
Western Sea
Southern Ocean (x4)
Endless Ocean
Eternal Sea
Vast Sea
Trackless Sea
The Great Sea
Black Sea
Silver Sea
Sea of Mist
Frozen Sea
White Sea
Twoways Sea
Ocean of Despair
Salt Sea
Sea of Glass
Sea of Narcissus
Dreamer Sea
Inner Sea
Twilight Ocean
Sea of Frost Ridge
The Jade Sea
The Silent Sea
Sea of Whispers
Chrysalis Sea
Sea of the Great Deep
Sea of Fangs
I'm surprised at how many of these I can already generate.  Here's a list of names generated using the ideas above:
Cormorant Sea
The Eerie Sea
Squid Ocean
The Peridot Sea
Codfish Sea
The Vermilion Sea
Hatchetfish Ocean
The Torrent Ocean
Jaeger Ocean
Sea of Maelstroms
The Icy Ocean
Razorbill Sea
The Dark Ocean
Sea of Cannibals
Oyster Ocean
The Ravaged Sea
Drum Sea
The Unlit Sea
This list seems a little heavy on sea creatures, but overall it seems a good selection. Unlike some of the other place names, I'm not generating any ocean names based upon invented words, but that's something I could easily add if I feel the need.

Generating the directional and temperature-related adjectives is a little challenging.  For something like a point or a bay, there's a pretty-well defined and small area, and the label is going to be close to that area.  But an ocean can stretch across much of the map, and so it's hard to pick a particular spot to serve as the reference point for temperature and direction.  In fact, I really want the temperature and direction to be based on where the label ends up.  So if I have (say) an ocean that stretches all the way from the bottom of the map to the top of the map along the left edge, I can call it the “Frigid Ocean" if I put the label up near the top, and the “Southern Ocean" if I put the label near the bottom.  But I pick the name before I place the label, so there's a bit of a problem here.

One workaround is to use the anchor point I select for the label as the reference point.  The label isn't tied very strongly to this point, but perhaps it will work for temperature and direction.
It isn't obvious, but this is a cold map (you can see some pine trees mixed into the forest), so “The Iceberg Sea" makes sense here.

When I implemented ocean labels I only created the one style shown above -- an angled, slightly curved label.  Since I'm working on ocean labels, I decided to add a couple of other styles.  First, a multi-line label:
And a single-line, horizontal label:
I don't much like this style.  One thing that might improve it is to let the label fall at an angle; that's something I've wanted to look into for some time, so I'll take this opportunity to implement it.

The basic implementation is to provide a range of allowed angles for each label, and let the label placing routine vary the angle along with the position as it is trying to find a good spot for the label.  (None of the evaluation code needs to change, because the criteria for a good placement haven't changed.)  Generally speaking I will limit straight labels of this sort to the angle range of -45 degrees to 45 degrees.  Anything outside that range becomes hard to read, looks awkward, or is upside down on the map.
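As a rough illustration of varying the angle along with the position, here's a sketch in Python.  The search strategy (a simple random local search) is a stand-in I've chosen for brevity, not the placement routine Dragons Abound actually uses; `place_label`, `score`, and the step sizes are all hypothetical.  The one detail taken from the post is the clamp to the -45 to 45 degree range.

```python
import random

MIN_ANGLE, MAX_ANGLE = -45.0, 45.0  # degrees; the range given in the post


def place_label(score, start, rng, steps=2000, max_shift=20.0, max_turn=10.0):
    """Random local search over (x, y, angle).

    `score(x, y, angle)` is the unchanged evaluation function; lower is better.
    `start` is the initial (x, y, angle) tuple.
    """
    best = start
    best_score = score(*best)
    for _ in range(steps):
        x, y, angle = best
        # Perturb the position and the angle together, keeping the angle
        # inside the allowed range.
        new_angle = angle + rng.uniform(-max_turn, max_turn)
        candidate = (
            x + rng.uniform(-max_shift, max_shift),
            y + rng.uniform(-max_shift, max_shift),
            min(MAX_ANGLE, max(MIN_ANGLE, new_angle)),
        )
        candidate_score = score(*candidate)
        if candidate_score < best_score:
            best, best_score = candidate, candidate_score
    return best
```

The evaluation function is passed in untouched, which is the point made above: adding the angle only widens the set of candidates, it doesn't change how a candidate is judged.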

With that fix in place, I get this:
The red box here shows the original label placement (basically the same as the final label placement above) and the green dot marks the “center of the ocean" that is an anchor for this label.  The program has found an angle that allows it to slide the label up between the two islands to get the label closer to the anchor.

If I relax the pull towards the anchor point, this happens:
With less pull from the anchor, the label swings the other way and slides below the middle islands.  This is a better placement in some respects (primarily, it is further away from other labels).

If I force a sea illustration onto the map, the ocean label will try to create some separation from the illustration, and this happens:
Now the label has swung back up between the islands, but pushed over a bit to stay away from the illustration.

The placement of the anchor point in this example actually isn't very good; it's near the center of the ocean, but too near islands in the ocean.  It would be better to find an anchor point that tries to stay away from land.  Also, it's nice for this anchor point to be close to an edge of the map that is largely ocean, so that the label indicates the greater mystery beyond the edge of the map.
At this point I realized that these are variants of a general case label that can move around, tilt, and have an arc to it.  And while I could always force a particular style by locking down any of those variables, I ought to just write the generalized label that would let the label placement algorithm change the location, angle and arc to try to find the best fit.  

(The exception here is the multi-line label.  SVG doesn't natively support multi-line text, so doing anything beyond stacking up a bunch of centered single-line labels quickly becomes very complicated.  In this case, I don't think multi-line labels work very well for oceans at any rate.)

Much hacking between that last paragraph and this one, but I have the general case working:
In this case, you can see the original label (the olive green outline) was arced upwards and fell partly off the map because the green anchor point is close to the edge of the map.  The final solution is arced downward and moved left and upwards so that it can both fit on the map and still be close to the anchor point (the bright green dot).

If I relax the requirement to stay near the anchor point it moves up and keeps the upward arc:
With that I'll declare victory on ocean labels.

Monday, July 9, 2018

The Naming of Places (Part 7): On Point

Next up in place names are points -- fingers of land reaching out into the sea:
Right now points are given place names by combining an invented word with “Point."  Let's try to improve on that.

I'll start at the usual place and gather some data about how points are named in the real world.  To start with, the common synonyms for “Point":
Point: 12674
Bar: 1123
Neck: 732
Ledge: 702
Rock: 542
Reef: 524
Shoals: 482
Head: 179
Peninsula: 158
Bank: 155
Unlike the other place names I've looked at, the synonyms for “Point" don't have anything like a power law drop-off.  In this case, “Point" is used overwhelmingly, with only a few other synonyms showing up.  A few other true synonyms show up further down the list, primarily “Flat" and “Horn."  There are also a few mountain-ish nouns mixed in; I suspect this is because you also sometimes see a mountain or ridge named “Point".

Here are the common adjectives used to directly describe points:
Long: 233
Middle: 106
North: 101
Indian: 91
West: 75
East: 74
White: 55
Grassy: 36
Big: 35
Great: 35
These are also familiar from bays.  (“Indian" was actually also on the bays list, but I didn't use it for the obvious reason -- doesn't really work on a fantasy map!)  Not a lot of creativity here; most points get adjectives describing their size or location.

One interesting type of adjective I notice is country-related:  American, Spanish, French, etc.  Since I do name countries/regions on my maps, I could conceivably do this same thing.  It looks to me that the rules for creating a “country adjective" from a country name are pretty simple (although there are many exceptions).

Now let's look at the most common nouns used in naming points.
eagle: 83
saint: 71
sand: 70
bluff: 63
oak: 54
birch: 53
tree: 52
neck: 51
goose: 51
pine: 49
Again, like bays, “saint" is a surprisingly popular choice.  In fact, there's a lot of overlap between bays and points -- all ten of these nouns already appear in my name choices for bays.

Finally, let's look at the adjectives used to modify those names:
old: 68
big: 57
lower: 38
long: 37
upper: 34
white: 29
lone: 26
great: 25
black: 22
green: 21
There's a lot of overlap here with both the direct adjectives above as well as the similar adjectives from bays.

My overall conclusion -- which matches my intuition -- is that points get named in much the same way as bays.  It probably isn't a perfect match, but I should be able to reuse most of the naming rules from bays for points.

There are a few bay-naming rules that don't work for points.  Most are just bay-specific vocabulary, like “Deep," but a more complex rule is the one that names a bay after a river that flows into the bay.  This doesn't make any sense for points (which at any rate rarely have rivers).  An alternate rule that does make sense is to name a point after a nearby bay.
On the left side of this map, the bay has been named after the city and the point has in turn been named for the bay.  (!) On the other side of the map, the point has been named directly after the city that is out on the point.

As is often the case, working on this surfaced a number of bugs in the point creation code.  For example:
I'm not sure what I think of naming geographical features after a Chicken Butcher, but the real problem here is the two co-named bays are too far apart.  I also found a number of problems where bays (and points) weren't being detected properly, and some cases where I thought there should be a bay and there wasn't.  I also added in some limits on the number of bays and points that can show up on a map.

I've fixed those, but there are probably still other tweaks I'll have to make for point names.  I'll deal with those as I see problems arise.

Monday, July 2, 2018

The Naming of Places (Part 6): Bay Watch

In the last posting, I mentioned that some of the common adjectives used for naming bays couldn't be applied randomly because they described aspects of the bay that could be determined from looking at the map.  If you name a bay “Big Cove" then it had better look big on the map.  In this posting I'll talk a little bit about how Dragons Abound will try to do that.

Dragons Abound doesn't intentionally create bays; instead it identifies them on the coastline by searching for bay shapes.

The green line here on the coast shows a bay being detected.  At this point, Dragons Abound knows the start and end points of the bay, the coastline in-between, the center point of the bay, and the sinuosity.  I can also calculate the area of the bay from the outline.  This is most of the information I need to smartly apply many adjectives.

To start with, I'll look at directional adjectives.  From the location of the center point, I can decide where the bay is on the map.  If the bay is toward the top of the map I can call it “North Bay" or toward the bottom “South Bay," and likewise for east or west.  Of course, that isn't exactly right for naming.  The map is presumably just a part of the world.  The top of the map isn't necessarily the “North."  And actually, places like “North Bay" don't always get called that because they're in the North, but rather because they're north of some landmark.  For example, a bay on the northern end of a lake is likely to get named “North Bay."  All the same, it will make more sense to the reader if “North Bay" is in the top part of the map.  This also means that if I happen to get a “North Bay" and “South Bay" on the same map, “North Bay" will actually be north of “South Bay!"

I could divide the map in halves and call anything on the top “North," etc., but this would lead to bays just barely above the center of the map getting labeled “North," which is probably bad.  So I'll divide the map in thirds, and anything in the middle third won't be eligible for a directional label.  I can also label the areas that are both “North" and “East" as “Northeast" and so on.  Here's a map with three bays labeled that way:
These all happen to be in the corners so they get the compound names.  Here's an example where the bay falls in the middle third of the map:
If I want to use these directions as adjectives, I can tack “-ern" onto the end of them and then use them as a possible adjective directly on the bay name, e.g., “Northern Bay" instead of “North Bay."
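The thirds rule is simple enough to sketch.  This is an illustrative Python version, not the Dragons Abound code; the function names are made up, and it assumes that y grows downward (as in SVG), so a small y means the top of the map.

```python
def map_direction(x, y, width, height):
    """Return a compass label for a point, or None if it falls in the
    middle third of the map on both axes.

    Assumes the origin is at the top-left and y grows downward.
    """
    if y < height / 3:
        ns = "North"
    elif y > 2 * height / 3:
        ns = "South"
    else:
        ns = ""
    if x < width / 3:
        ew = "West"
    elif x > 2 * width / 3:
        ew = "East"
    else:
        ew = ""
    # Compound names like "Northeast" lowercase the second half.
    label = ns + ew.lower() if ns else ew
    return label or None


def as_adjective(direction):
    """Turn "North" into "Northern", "Northeast" into "Northeastern"."""
    return direction + "ern"
```

For a 100 by 100 map, `map_direction(90, 10, 100, 100)` gives “Northeast", while a bay centered at (50, 50) gets no directional label at all.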

However, directional adjectives don't really work in other phrases, such as “Northwestern Bayberry Cove."  The problem is that a directional adjective would only be added to a name like “Bayberry Cove" to distinguish it from a similar location with the same name.  So unless there's another “Bayberry Cove" around, the directional adjective is not really needed.  And that's not likely to happen with my place name generator.

Unless I do it intentionally, of course.

The idea would be to find two bays on the map, give them the same name, and then add directional adjectives to distinguish them.  The directional adjectives in this case are a little different.  Unless we're amazingly lucky, the centers of the two bays are going to be on a diagonal -- they'll be separated in both north/south and east/west.  So if we use the kind of directional adjectives I used above, they'd always end up “Northeast/Southwest" or a variant.  In this case, what we really care about is which direction has the most separation between the bays.  And then we use “North/South" or “East/West" as appropriate to name the bays.
I might want to only do this with bays that are within some distance of each other, and maybe only with bays that are on the same shared sea.  But for the moment, if I decide to do this on a map I'll just pick the two closest bays.
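Here's a small Python sketch of both steps: finding the two closest bays and picking the axis with the most separation.  The function names are hypothetical, bay centers are plain (x, y) tuples, and y is assumed to grow downward so that a smaller y is further north.

```python
from itertools import combinations
from math import dist


def closest_pair(centers):
    """Return the two bay centers that are nearest to each other."""
    return min(combinations(centers, 2), key=lambda pair: dist(*pair))


def paired_directions(center_a, center_b):
    """Return the directional adjectives for bay A and bay B, using
    whichever axis separates the two centers the most."""
    (ax, ay), (bx, by) = center_a, center_b
    if abs(ay - by) >= abs(ax - bx):
        # More north/south separation; smaller y is further north.
        return ("North", "South") if ay < by else ("South", "North")
    return ("West", "East") if ax < bx else ("East", "West")
```

So two bays centered at (10, 10) and (20, 80) come out as North/South, even though one is also slightly west of the other.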

Now let me move on to another case:  size adjectives.  I can calculate the area of the bay and use that to decide that the bay is big and should be described as “Great" or that the bay is small and should be described as “Little".  Of course, big and little are relative terms; I'll have to try some examples and figure a cut-off size that works well.  I'm helped here that these are supposed to be names that people gave these bays; people are imprecise about things like size, so I don't have to worry too much about getting the cut-off exactly right.
You'll notice I've gotten the synonym “Narrow" on this map.  I could actually calculate the longest and shortest axes of the bay and decide if it was more oblong or round, but I don't think that's really worth it; people will use their imagination to fill that in.

Just as I could use the directional adjectives to distinguish two bays with the same name, I can use size adjectives the same way.  This works much the same way; I pick two bays near each other and call the larger one “Big Dolphin Bay" and the smaller one “Small Dolphin Bay".
Or in this case, Lesser/Greater Tapicer Cove.

Since I'm working with the bay sizes, I also want to distinguish the bay synonyms based on size.  I've generated a couple of test maps where a small bay gets labeled “Gulf" or a large bay gets labeled “Pocket," which is jarring.  So I'd like to divide the synonyms up by the sizes they imply and then use them appropriately.  Here's how I broke down the synonyms:

Small: Cove [125], Inlet [42], Hole [21], Eddy [13], Pocket [8], Bend [6], Neck [4], Hollow [3], Prong [3], Branch [2], Pool [2], Horn [2], Narrows [1], Gap [1], Gully [1], Indent [1]
Medium: Bay [166], Anchorage [83], Sound [28], Bight [14], Bottom [8], Slough [6], Gulch [4], Lagoon [3], Wick [2], Firth [2], Voe [2], Gut [1], Cut [1]
Large: Gulf [17], Basin [8], Reach [3], Refuge [1], Arm [1]

The numbers in brackets indicate the relative probability of picking each option.  It's long been known that vocabularies follow a power law distribution (Zipf's law).  I don't know if synonym choice follows a similar pattern, but it seems reasonable, and power law distributions seem “natural" to people, so I used a rough power law distribution on the synonyms as well.
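A weighted pick like this takes only a few lines in Python.  The sketch below uses the first five entries from each of the three groups above with their bracketed weights; reading the groups as small, medium and large is my interpretation, and the area cut-offs are placeholder parameters, not values from Dragons Abound.

```python
import random

# Relative weights copied from the bracketed numbers above
# (first five entries of each group only).
SYNONYMS = {
    "small": {"Cove": 125, "Inlet": 42, "Hole": 21, "Eddy": 13, "Pocket": 8},
    "medium": {"Bay": 166, "Anchorage": 83, "Sound": 28, "Bight": 14, "Bottom": 8},
    "large": {"Gulf": 17, "Basin": 8, "Reach": 3, "Refuge": 1, "Arm": 1},
}


def size_class(area, small_below, large_above):
    """Classify a bay by area; both cut-offs are tuning parameters."""
    if area < small_below:
        return "small"
    if area > large_above:
        return "large"
    return "medium"


def pick_synonym(area, rng, small_below, large_above):
    """Pick a bay synonym appropriate to the bay's size, weighted so that
    common words like "Cove" turn up far more often than "Pocket"."""
    words = SYNONYMS[size_class(area, small_below, large_above)]
    return rng.choices(list(words), weights=list(words.values()), k=1)[0]
```

With this split, a small bay can never come out as a “Gulf" and a large one can never be a “Pocket."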
Another bit of information I have about bays is temperature.  I can use this to decide whether a bay is hot (“Jungle Bay") or cold (“Arctic Bay").  This works the same way as the other choices.
On this map (which is entirely in Arctic regions), I get “Winter Bay" and “Frosty Basin."

That covers the descriptive adjectives, but there's a similar case I want to implement as well -- naming the bay after a nearby map feature.  In the arctic map just above, both of the bays have cities on their shores.  In these cases, naming the bay after the city makes sense, e.g., “Winter Bay" could be named “Kigil Bay."  This could make sense for other features as well, for example, having a point and a bay with the same name.  But for the moment I'm only going to name after cities.

One tricky aspect of this is that there are often multiple cities on a bay:
In this case, it seems like the bay ought to (usually) be named after the biggest city.  Of course you can imagine situations where the bay would be named after one of the smaller cities, so I'll include that as a (lower-likelihood) possibility as well.
You might recall in my previous posting that I noted that the synonym “harbor" is a little different from the other bay synonyms because it implies that people have created a harbor on that bay.  So when I name a bay after a city, I include the possibility of calling it a harbor, e.g., “Chatgar Harbor."
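As a sketch of the city rule in Python: usually take the biggest city, occasionally a smaller one, and sometimes use “Harbor" as the synonym.  The probabilities (80% and 30%) and the function name are invented for illustration, and cities are assumed to be simple (name, population) pairs.

```python
import random


def name_bay_after_city(cities, rng, p_biggest=0.8, p_harbor=0.3):
    """Name a bay after one of the cities on its shore.

    `cities` is a non-empty list of (name, population) pairs.
    `p_biggest` and `p_harbor` are made-up tuning values.
    """
    ranked = sorted(cities, key=lambda city: city[1], reverse=True)
    if len(ranked) == 1 or rng.random() < p_biggest:
        city = ranked[0][0]
    else:
        # Lower-likelihood case: one of the smaller cities.
        city = rng.choice(ranked[1:])[0]
    kind = "Harbor" if rng.random() < p_harbor else "Bay"
    return f"{city} {kind}"
```

In a fuller version the non-harbor case would pick from the size-appropriate synonym list rather than always using “Bay."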

Finally, I also want the option to name a bay after one of the rivers that flows into the bay.  So in this case:
the bay might get named “Ruzu Cove" after the Ruzu River that flows into the bay.  To do this, I need to make sure that I've named the rivers before trying to name the bay, and then look through the rivers that end in the bay to find the largest named river flowing into the bay.
Now the bay is named after the river.  In this case I've forced all the bays to be named after rivers, but note that the upper bay seems to have a random name.  That's actually the name of one of the rivers flowing into that bay, but the labeling algorithm has decided the river is too small to display that name, so it has been left off the map.

And that about wraps up my current plans for bay place names.  I'll keep adding vocabulary as I have ideas and tweak the probabilities of the different naming options, but that can wait for the future.  A test run with all the possibilities turned on:

Tuesday, June 26, 2018

The Naming of Places (Part 5): At Bay

Now that I have a grammar tool more-or-less working and retrofitted the Lost Coast names to the tool, I'm going to tackle another type of place names: bays.
This picture illustrates how Dragons Abound currently handles bays:  It identifies a stretch of coastline that creates a “pocket" and then attaches a label to the center of the pocket.  Currently the place name is just an invented name from Martin O'Leary's generator plus one of a few synonyms for “bay," such as “basin" and “harbor."  Let's see if I can do better.

I'm going to start out by looking at how bays are named in the real world.  As I did previously with mountains, I'm going to analyze real world naming data to see what kinds of words are used in bay place names.  In my previous effort, I analyzed place names data from the USGS.  Since then, I've discovered another source for real world place names:  Geonames.  This organization provides a worldwide geographical naming database under a Creative Commons license.  Like the USGS database, names are labeled with a feature class that I can use to extract out all the “bays".  I can then run some lexical analysis on these names to help guide how I create my fantasy place names.

For a first step, I'll run all the names through and look at the last word, which is usually “Bay" or a synonym.  Here are the top 20 results:
  1. Bay: 9482
  2. Cove: 6674
  3. Harbor: 1280
  4. Creek: 904
  5. Bayou: 850
  6. Inlet: 404
  7. Pond: 363
  8. Lake: 302
  9. Arm: 297
  10. Sound: 291
  11. Lagoon: 281
  12. Hole: 228
  13. Canal: 216
  14. Bight: 204
  15. Pass: 169
  16. Basin: 167
  17. Slough: 134
  18. Anchorage: 90
  19. River: 84
  20. Eddy: 59
There is some duplication here because many of the same features appear in both the USGS data and the Geonames data.  Some notes:
  • “Bay" and “Cove" are by far the most common names.
  • “Harbor" is an interesting synonym, because it presumably applies only to bays where people have created a harbor.  For Dragons Abound purposes, this suggests only using it where a city is on the bay.  “Anchorage" also implies a human use to the area, but in that case it doesn't imply a city.
  • “Creek" and “River" are odd synonyms to apply to a bay.  Looking at some examples in Google Maps suggests that these are places where a creek/river empties into a larger body of water and the area has been identified as a bay, which I think means I can safely ignore these names for my purposes.
  • Something similar is happening with “Pond" and “Lake"; these seem to be mostly ponds and lakes connected to some bigger body of water.  I'll ignore these too.
In the end, I have about 40 synonyms for “bay", most of which will be used very infrequently.
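The last-word tally is a one-pass count.  Here's a minimal Python sketch; the function name is made up, and the sample names are just bay names mentioned in these posts, standing in for the real USGS and Geonames extracts.

```python
from collections import Counter


def last_word_counts(names):
    """Count how often each final word ("Bay", "Cove", ...) appears."""
    counts = Counter()
    for name in names:
        words = name.split()
        if words:
            # Normalize case so "bay" and "Bay" are counted together.
            counts[words[-1].capitalize()] += 1
    return counts


if __name__ == "__main__":
    sample = ["East Bay", "Long Bay", "North Bay", "Bayberry Cove", "Chatgar Harbor"]
    for word, count in last_word_counts(sample).most_common():
        print(f"{word}: {count}")
```

The same loop with `words[:-1]` instead of `words[-1]` gives the counts for words that are not in the final position.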

Next, let me look at the nouns that have been used in bay names, but not as the last word of the name:
  1. island: 269
  2. creek: 215
  3. lake: 181
  4. port: 180
  5. point: 159
  6. saint: 118
  7. arm: 111
  8. sand: 91
  9. river: 85
  10. goose: 81
  11. rock: 77
  12. mud: 75
  13. mill: 70
  14. harbor: 70
  15. oyster: 64
  16. spring: 59
  17. head: 50
  18. hill: 48
  19. house: 46
  20. beach: 45
You can see some interesting trends here.  There are a lot of bays named after other (presumably nearby) geographical features: islands, creeks, lakes, points, arms, rivers, springs, heads, hills and beaches.  Harbor and its synonym port show up again.  Terms for earth are common:  sand, rock, and mud.  Animals, too:  goose and oyster.  An interesting term is Saint -- many geographical features are named after saints.

I looked at about the hundred most common terms and tried to break them down by category:

Geo Features: island, creek, lake, point, arm, river, spring(s), head, hill, branch, pond, hole, brook, fork, finger, key, neck, wash, sea, bluff, isle, mountain
Manmade Features: port, mill, harbor, house, shelter, camp, boat, garden, yacht, castle, fort, town, sawmill, schooner
Person: saint, squaw, king
Natural Items: sand, rock, mud, salt, water, stone, boulder, granite
Animals: goose, oyster, eagle, fish, deer, turtle, bass, horse, seal, fox, otter, alligator, cow, clam, beaver, mosquito, herring, swan, buck, bird, bull, salmon, pigeon, sheep
Plants: willow, oak, pine, tree, hickory, stump
Misc: hammock, devils, mile, greens, basin, drum, echo, sunset, tar, winter, paradise, dollar, crescent

Almost all of the terms fall into just six categories.  Of course, as you go further down the list into less common terms they become more diverse, but I can take this as a starting point to elaborate a more complete set of possible nouns to use in naming bays.

It would be nice if there was a simple automated system for elaborating these categories, but there's not.  It's mostly a matter of grunt work to research and capture terms, and to try to decide along the way how much categorization is worthwhile.  For example, to elaborate the “animals" category, I have to go to various websites with lists of animals and decide which ones to include, and whether it is worthwhile to categorize them as land versus sea (yes), whether a “pelican" should go into land, or sea or air (maybe both sea and air?) and whether insects should be included (no).  Eventually I end up with a list of land animals:

Aardvark, Bushbaby, Cougar, Gila Monster, Jaguar, Mule Deer, Python, Tapir
Badger, Cat, Dormouse, Hare, Lynx, Peccary, Sea Lion, Water Moccasin
Boa Constrictor, Chicken, Emu, Hog, Meerkat, Polecat, Sloth, Wildebeest
Boar, Chimpanzee, Ermine, Honey Badger, Mink, Pony, Snail, Wolf
Box Turtle, Chuckwalla, Fox, Hyena, Mongoose, Porpoise, Spider, Wombat
Buffalo, Cobra, Gecko, Impala, Moose, Prairie Dog, Squirrel, Woodrat

After completing this, it occurred to me that I might like to have a separate list of polar animals, so that I can use more appropriate names when adding bays in cold climes.  So I went back and broke out the list of polar animals.

penguin, tern, moose, northern pike
polar bear, ermine, puffin, cod
snow fox, stoat, snow goose, whiting
snowshoe hare, lemming, walrus, flounder

So now in a snowy clime I'll get “Penguin Cove" instead of “Camel Cove."

If this sounds like a lot of work, it is.  But for something like this, a shortcut is not going to produce a good result.  In fact, I'm probably not doing *enough* work on this; a really good solution would capture more semantic information about the words and use that to construct the place names.

An initial pass generates a few thousand possible nouns to use in bay names, and generates names like these:
  1. Swirl Cove
  2. Headrail Gulf
  3. Annihilation Sound
  4. Illuminator Bay
  5. Fool Bay
  6. Silk-Dresser Cove
  7. Monastery Bay
  8. Moneyer Bay
  9. Beryl Bay
  10. Table Bay
You can see that there are some archaic terms in the list that you might not recognize.

A simple extension is to flip a name like “Swirl Cove" into “Cove of Swirls."  Note that in most cases I have to pluralize the noun for this to sound right -- but not mass nouns like “mud" or abstract nouns like “annihilation."  
  1. Shale Bay
  2. Cove of Chamberlains
  3. Bay of Furriers
  4. Gum Bay
  5. Pocket of Nakerers
  6. Rogue Cove
  7. Bushbaby Hole
  8. Mud Cove
  9. Brewer Bay
  10. Orca Cove
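Here's a rough Python sketch of the flip.  The pluralizer is deliberately naive (a few English suffix rules, no irregular plurals), and the set of nouns to leave alone is a hand-made sample; both are assumptions for illustration rather than how Dragons Abound handles it.

```python
# Sample of mass and abstract nouns that should not be pluralized.
# A real list would have to be built by hand, like the other vocabularies.
MASS_OR_ABSTRACT = {"mud", "sand", "shale", "annihilation"}


def pluralize(noun):
    """Naive English plural: handles -s/-x/-z/-ch/-sh and consonant + y."""
    lower = noun.lower()
    if lower.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    if lower.endswith("y") and lower[-2:-1] not in "aeiou":
        return noun[:-1] + "ies"
    return noun + "s"


def flip(name):
    """Turn "Swirl Cove" into "Cove of Swirls"."""
    noun, place = name.rsplit(" ", 1)
    if noun.lower() in MASS_OR_ABSTRACT:
        return f"{place} of {noun}"
    return f"{place} of {pluralize(noun)}"
```

With these rules “Swirl Cove" becomes “Cove of Swirls" and “Furrier Bay" becomes “Bay of Furriers", while “Mud Cove" becomes “Cove of Mud" with the noun left alone.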
Another variation for nouns that represent individual people is to use the possessive, e.g., “Rogue's Cove" or “The Rogue's Cove."  I can also use an invented name for the individual, and I can add a title to that individual as well (as I did when naming the Lost Coast).
  1. Cartier Bay
  2. Oilskin Gully
  3. Viper Cove
  4. Steward Bay
  5. Dolphin's Bay
  6. Purse Maker Bay
  7. Vicar Daedoe's Prong
  8. Catchpole's Bay
  9. Reverend Beedae's Cove
  10. Duednoeb's Cove
Interestingly, it sounds fine to use animal names where individual names work, which is why I get “Dolphin's Bay" above.

Another way to name a bay is with an adjective, such as San Francisco's “East Bay" or “Long Bay" in Myrtle Beach.  To get an idea of what sort of adjectives I can use to name bays this way, I'll go back to my corpus of US and UK bay names and pull out all the adjectives that are used this way (as best I can).  Here are the top twenty:
  1. North
  2. West
  3. Long
  4. Big
  5. East
  6. Little
  7. Grand
  8. Middle
  9. Hollow
  10. Indian
  11. Sandy
  12. Horseshoe
  13. Browns
  14. Great
  15. Round
  16. Blind
  17. Flat
  18. Twin
  19. Finger
  20. Blue
Directions are very popular.  (South is missing, but that seems to be a problem with my Parts of Speech classifier not thinking it is an adjective.)  Sizes are also popular: Long, big, little, grand, great.  A little further down are shapes:  Hollow, horseshoe, round, flat, twin, finger.  “Browns" is a little odd; I think that's a possessive with a missing apostrophe:  “Brown's Bay."  At number 20 we get an actual color -- a bit of a surprise to me because I thought colors would be more popular.

One note about direction and size adjectives -- I probably don't want to use these randomly.  A “northern" bay ought to be to the north in some way, and “great" bays should be larger than average.  I'll talk about implementing that next time, but for the moment I'll just collect these adjectives and set them aside.
  1. Admiral dem Gen's Bay
  2. Chokeweed Bay
  3. Indigo Pass
  4. Paradise Bottom
  5. Deserted Bay
  6. Pufferfish's Bay
  7. Damnation Bay
  8. Catfish Cove
  9. Quiet Bay
  10. Glassy Cove
I've turned up the frequency to make the adjective names more common, but you see here “indigo," “paradise," “deserted," “quiet," and “glassy" as examples.

The next option is to use both an adjective and a noun to name the bay, such as “Blue Dolphin Bay."  In this case, the adjective modifies the noun, not the bay, so a different set of adjectives is needed.  In theory, the adjective should be something that makes sense with the noun, but in practice the human mind is so good at conjecturing connections between an adjective and a noun that it's almost difficult to come up with an example that seems totally wrong.  For example, “Virgin Dolphin Bay," “False Dolphin Bay," and “Barbarian Dolphin Bay" are all cases where the adjective doesn't really fit with the noun, but seem like acceptable place names.  I'm sure it's possible to come up with combinations that totally don't work, but my point is that you don't have to be as careful about this as you might imagine.  However, you do have to avoid abstract nouns.  Something like “Blue Infidelity Bay" is nonsensical enough to trigger even the most forgiving reader.

An initial cut at adding these sorts of names gives me (again, with tweaked probabilities) this list:
  1. Sore Brigantine Bay
  2. Imashosh's Bay
  3. Dead Raft Bay
  4. Lone Cedar Bay
  5. Dukdush's Cove
  6. Broom-Dasher Bay
  7. Cove of Hermits
  8. Dukuklen's Bay
  9. Roadhouse Bay
  10. Parson Prong
Here “Sore Brigantine" doesn't make much sense (a brigantine is a two-masted sailing vessel) and “Dead Raft" isn't much better, but I think they would pass on a map.  (“Lone Cedar" on the other hand is nearly perfect.)

That's it for the first part of naming bays.  The lists of nouns and adjectives can be expanded near endlessly, but I have about 3000 words in the vocabulary right now, which makes for plenty of different names.  In the next posting I'll get into some of the less random naming strategies.

Obligatory map shot:
“Limner's Bay," “Elk Bay," and “Penny Firth."

Wednesday, June 20, 2018

The Naming of Places (Part 4): Using a Tool

One thing I learned in working on “Lost Coast" names in the last posting was that treating words as lists of strings is very limiting.  For example, I added some common monster names to the words I was using, and I could use those in various ways:  “The Vampire Coast," “The Coast of Vampires," “The Vampire's Graveyard," and so on.  But I ended up just duplicating those names in different templates.  It would be nice if I had a way to say “Include all the monster words here."  Likewise, I sometimes need the plural form of a word, and I had to add that in manually.  So it seems like it's time to consider using a language generation tool.

One of the main features I need is to pick a word randomly from a class of equivalent words, as for example when I select a word that means “coast."  It would be nice to have a tool that provides built-in word categories that would solve all my needs.  Some tools (like WordNet) provide word classification into various categories, and also provide lists of synonyms.  Unfortunately, these tools usually categorize words by parts of speech (POS), e.g., noun, adjective, verb, etc., which isn't useful for me.

Some tools also provide synonyms, but they're not usable without manual editing.  For example, WordNet synonyms for coast include “seashore" and “lakeshore", neither of which works well in labeling a coastline.  And there's no way for WordNet to know that in this case, “cliff" is an acceptable synonym for coast, even though those words do not mean the same thing at all.  There are shades of meaning in words that are difficult to capture and use in a language library.

So I don't think I'm going to find a tool with built-in categories and synonyms that I can use without modification.  Instead, I'll need a tool that supports creating my own categories, e.g., to be able to say “words to use in labeling a coastline are coast, shore, strand, bank, ... etc."

I also want to be able to weight the choices in these synonym lists to indicate that some choices are more common than others.  Quite a few of the tools I've looked at only support choosing randomly (with equal probability) from a list.  I can work around that limitation by repeating the same word multiple times in the list, but if I want a word to be 100 times as common as another word, that requires a lot of repeats.

Another feature I'd like is for the tool to be able to pluralize (and conversely, singularize) words for me, so that I don't have to enter each word as both a singular and a plural.  I can use this to switch between “The Ogre Coast" and “The Coast of Ogres" easily.  I think there will still be some cases where I have to explicitly indicate a plural or a singular but in many cases I could rely on the tool to create the appropriate plural / singular.
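As a rough illustration of the pluralizing feature described above, here is a minimal sketch in plain Javascript.  It is not how RiTa or Tracery do it; the names `pluralize`, `irregular` and `coastNames` are made up for this example, and the rules only cover the simplest English cases, with a small override table for the rest.

```javascript
// Hypothetical helper: a naive English pluralizer, for illustration only.
// Explicit overrides handle words the simple rules below would get wrong.
const irregular = { Wolf: 'Wolves', Djinni: 'Djinn' };

function pluralize(word) {
  if (irregular[word]) return irregular[word];
  // consonant + y -> ies (e.g., Mummy -> Mummies)
  if (/[^aeiou]y$/i.test(word)) return word.slice(0, -1) + 'ies';
  // sibilant endings take -es (e.g., Witch -> Witches)
  if (/(s|x|z|ch|sh)$/i.test(word)) return word + 'es';
  return word + 's';
}

// Produce both name forms from a single singular entry.
function coastNames(monster) {
  return ['The ' + monster + ' Coast', 'The Coast of ' + pluralize(monster)];
}

console.log(coastNames('Ogre').join(' / '));
```

The override table is where the "explicitly indicate a plural" cases would live; everything else falls through to the rules.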

There are also a number of common features I don't need.  I'm not building interactive fiction (IF) so I don't need any facilities for interacting with the user.  Some tools have ways to remember a choice and use the same choice later, so that you can (for example) pick the name “Pete" in one place and then use it throughout a long text.  I can think of some cases where I might want to do this, but in general it's a feature I can live without.  I also don't need to parse text, assign parts of speech to words, or similar input-oriented tasks.

With all that in mind, I think my best choices come down to Tracery and RiTa.

Tracery is focused on text generation, and has a simple, clean format for expressing a grammar and some nice built-in features like capitalization.  On the negative side, it lacks any way to do choice weighting.  (There exists a fork of Tracery that adds choice weighting (and much more!) but it is written in Swift rather than Javascript.)  In fact, Tracery does a “shuffled deck" selection when making choices, meaning it runs through all the choices (randomly) before repeating itself.  Since I want some place name choices to repeat (e.g., I want to use “coast" much more often than “bracks"), this won't work for me.
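To make the difference concrete, here is a small sketch contrasting “shuffled deck" selection with independent random choice.  This is not Tracery's actual code; `makeDeckPicker` and `makeIndependentPicker` are hypothetical names, and the `rng` parameter is only there so the behavior can be tested with a fixed random source.

```javascript
// "Shuffled deck": every option is dealt once (in random order) before any repeats.
function makeDeckPicker(options, rng = Math.random) {
  let deck = [];
  return function pick() {
    if (deck.length === 0) {
      deck = options.slice();
      // Fisher-Yates shuffle of the fresh deck
      for (let i = deck.length - 1; i > 0; i--) {
        const j = Math.floor(rng() * (i + 1));
        [deck[i], deck[j]] = [deck[j], deck[i]];
      }
    }
    return deck.pop();
  };
}

// Independent choice: each pick ignores what came before, so repeats are possible.
function makeIndependentPicker(options, rng = Math.random) {
  return () => options[Math.floor(rng() * options.length)];
}

const words = ['coast', 'shore', 'bracks'];
const deckPick = makeDeckPicker(words);
// Three deals from a fresh deck always contain each word exactly once,
// so "bracks" shows up just as often as "coast".
console.log([deckPick(), deckPick(), deckPick()].sort().join(', '));
```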

RiTa is a more general toolkit that provides features like analyzing text, conjugation, stemming, Markov chains, and more.  The grammar format is similar to Tracery, but lacks a number of features (like capitalization) that Tracery provides.  However, it does provide choice weighting.  On the negative side, it's much bigger (about 10x) than Tracery, but it has some reduced versions and I suspect I can make do with the smallest.

For the moment, at least, I'm going to work with RiTa.

Both Tracery and RiTa do generation with context-free grammars.  If you aren't familiar with context-free grammars, they consist of rules that look something like this:
<coast> => coast | shore | banks
This rule means “Wherever you see the symbol <coast>, replace it with the word coast, or shore or banks."  You can chain these rules:
<lost> => lost | forgotten | accursed
<name> => The <lost> <coast>
Together, these rules say that you generate a <name> by generating the word “The" followed by whatever the <lost> symbol generates and then whatever the <coast> symbol generates.

(The “context-free" part just means that you can only have one symbol on the left side of a rule.)
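The rules above can be run with a very small expander.  The following is a toy sketch in plain Javascript, not the engine inside RiTa or Tracery; the `expand` function and the `rng` parameter are assumptions made for the example.

```javascript
// A toy grammar: each symbol maps to a string of alternatives separated by "|".
const rules = {
  '<coast>': 'coast | shore | banks',
  '<lost>': 'lost | forgotten | accursed',
  '<name>': 'The <lost> <coast>',
};

// Expand a symbol: pick one alternative, then expand any symbols inside it.
function expand(symbol, grammar, rng = Math.random) {
  if (!(symbol in grammar)) return symbol; // unknown symbols are left as-is
  const options = grammar[symbol].split('|').map((s) => s.trim());
  const choice = options[Math.floor(rng() * options.length)];
  return choice.replace(/<[^>]+>/g, (inner) => expand(inner, grammar, rng));
}

console.log(expand('<name>', rules)); // e.g. "The forgotten shore"
```

Each call starts at `<name>` and keeps replacing symbols until only plain words are left.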

As you may remember, I can also name a lost coast after a monster, e.g., “The Zombie Coast".  With a grammar, I can now separate out the monster names:
<coast> => coast | shore | banks
<lost> => lost | forgotten | accursed
<monster> => Zombie | Kobold | Orc
<adj> => <lost> | <monster>
<name> => The <adj> <coast>
The <adj> symbol can now expand into one of the synonyms for lost, or a monster name.

Whenever the grammar engine has to make a choice, it chooses randomly among the options.  So in this case, “coast", “shore" and “banks" are all equally likely names.  If we want “coast" to occur more frequently than the other choices we can (in RiTa) add a frequency to the choice:
<coast> => coast[5] | shore | banks
which says (in this case) to pick “coast" five times out of seven.
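Here is a minimal sketch of how a weighted choice like this could be evaluated.  It only mimics the `[n]` notation shown above and is not RiTa's parser; `parseChoices` and `weightedPick` are hypothetical names, and a choice without a number is assumed to have weight 1.

```javascript
// Parse "coast[5] | shore | banks" into [{ text, weight }], defaulting weight to 1.
function parseChoices(rule) {
  return rule.split('|').map((part) => {
    const m = part.trim().match(/^(.*?)\s*(?:\[(\d+)\])?$/);
    return { text: m[1], weight: m[2] ? Number(m[2]) : 1 };
  });
}

// Pick a choice with probability proportional to its weight.
function weightedPick(choices, rng = Math.random) {
  const total = choices.reduce((sum, c) => sum + c.weight, 0);
  let r = rng() * total;
  for (const c of choices) {
    r -= c.weight;
    if (r < 0) return c.text;
  }
  return choices[choices.length - 1].text; // guard against rounding
}

const coast = parseChoices('coast[5] | shore | banks');
console.log(weightedPick(coast)); // "coast" about five times out of seven
```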

That works well to intentionally adjust word frequencies, but random choice can cause a more subtle problem.  Consider, for example, this rule:
<adj> => <lost> | <monster>
50% of the time this rule will use a “lost" adjective, and 50% of the time it will use a monster name.  But in my case, I have 325 synonyms for lost, and only 33 monster names!  I really want to pick an adjective equally from that whole pool.  I can fix this by adding a frequency to this rule:
<adj> => <lost>[10] | <monster>
but this requires me to count the choices in each category, calculate the ratios, and keep all this up-to-date as I add new terms and options.  That's error-prone, and gets complicated when there are multiple levels of rules.

These sorts of rules represent composition rather than choice -- we'd really like to have some syntax like
<adj> => <lost> & <monster>
to indicate that the grammar engine should compose the two lists together before choosing.   Neither RiTa nor Tracery seems to have this capability.  Maybe I'll add that, but in the meantime I'll use a workaround of defining <lost> and <monster> in Javascript and combining them when I create the rule:
let lost = 'lost | forgotten | accursed';
let monster = 'Zombie | Kobold | Orc';
<adj> => lost + ' | ' + monster
Apologies for the pseudocode mish-mash, but I hope you understand what I mean.
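For readers who want something runnable, the same workaround might look like the sketch below.  The `compose` helper and the use of arrays for the word classes are assumptions for the example; only the idea of joining the lists before building the rule comes from the text above.

```javascript
// Keep each word class as an array so classes can be composed before choosing.
const lost = ['lost', 'forgotten', 'accursed'];
const monster = ['Zombie', 'Kobold', 'Orc'];

// Turn one or more word lists into a single rule string of alternatives.
function compose(...lists) {
  return lists.flat().join(' | ');
}

const rules = {
  '<coast>': 'coast | shore | banks',
  '<adj>': compose(lost, monster),
  '<name>': 'The <adj> <coast>',
};

console.log(rules['<adj>']);
// lost | forgotten | accursed | Zombie | Kobold | Orc
```

Because every word ends up in one flat list of alternatives, each is equally likely no matter how lopsided the original lists are, and no counts or ratios need to be kept up to date.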

Another thing I need to do in name generation is to insert the result of a Javascript function call.  For example, I can name a lost coast after some (imaginary) person:
Mesh's Boneyard
In these cases, the name of the imaginary person is generated by Martin O'Leary's place name generator, with a call that looks like this:
Language.makeName(world.lang, 'person')
So if I want to be able to generate a name like this, I need a way to tell the grammar “Hey, at this point go off and execute this Javascript and use the result."  This is called a “callback", and in RiTa this works by enclosing it in backticks:
`Language.makeName(world.lang, 'person')`'s <coast>
There's a lot of quoting going on there, but the important part is that the call to Language.makeName() is inside backticks.  When the grammar engine evaluates this rule, it knows to pull out that bit of code, run it, and put the result back into the rule.

It turns out that this isn't as straightforward as it looks.  Without getting too technical, every bit of code executes in a context that represents all the other code and definitions around it.  In this case, code was written in one context but gets executed in a different context.  This creates no end of problems.  For example, in the rule above, “Language.makeName" isn't defined in the context where the code actually gets executed, and so the callback fails.

RiTa has a solution for this problem, but it isn't very good.  I patched my copy of RiTa with a better solution (and provided that back to the RiTa authors) so my callbacks work as I expect.
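To show the general shape of the context problem (and one way around it), here is a hypothetical sketch.  It is neither RiTa's mechanism nor the patch mentioned above; `runCallbacks` is an invented function that takes an explicit context object, and the `Language` and `world` objects are stand-ins for the real ones.

```javascript
// Hypothetical sketch: evaluate backtick callbacks against an explicit context
// object, so names like Language and world are defined when the code runs.
function runCallbacks(text, context) {
  const names = Object.keys(context);
  const values = Object.values(context);
  return text.replace(/`([^`]*)`/g, (_, code) => {
    // Build a function whose parameters are the context's names.
    const fn = new Function(...names, 'return (' + code + ');');
    return String(fn(...values));
  });
}

// Stand-ins for the real name generator and world state (assumptions).
const Language = { makeName: (lang, kind) => (kind === 'person' ? 'Mesh' : 'Zo') };
const world = { lang: {} };

const rule = "`Language.makeName(world.lang, 'person')`'s Boneyard";
console.log(runCallbacks(rule, { Language, world })); // Mesh's Boneyard
```

The point is only that the code inside the backticks sees whatever is handed to it in `context`, rather than whatever happens to be in scope inside the grammar engine.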

RiTa has a number of ways to actually write a grammar, but I'm using a JSON format.  Here's what the core of the Lost Coast naming rules looks like:

     // The Lost Coast
     '<lc1>': 'The <lost> <coast>',
     '<lc2>': 'The <coast> of <lost2>',
     '<lc3>': "The <sailor>'s <negcoast>",
     '<lc4>': "`Language.makeName(world.lang, 'person')`'s <negcoast>",
     '<lc5>': "<noble> `Language.makeName(world.lang, 'person')`'s <negcoast>",
     '<lc6>': "<admiral> `Language.makeName(world.lang, 'person')`'s <negcoast>",
     '<lc>': '<lc1>[10] | <lc2>[5] | <lc3>[3] | <lc4>[3] | <lc5> | <lc6>',

<lc1> through <lc6> are the basic patterns for different Lost Coast names (as described in the previous posting).  In <lc4>, <lc5> and <lc6> you can see callbacks to Martin O'Leary's name generator.  The last rule sets the proportions to use the various forms, “The Lost Coast" being the most common and forms like “Admiral Dyg's Boneyard" being fairly uncommon.

There are a couple of functions I think I might eventually want to use that aren't in RiTa or Tracery.

One is the capability to remember and reuse a name across different runs of the grammar.  For example, if I create “Admiral Dyg's Boneyard" I might want to name a nearby rocky point “Admiral Dyg's Folly."  Or if I name a mountain “Black Rock Peak" I might want to name a nearby city “Black Rock Town."  I'm not exactly sure of the best way to do this yet.
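One possible shape for this, purely as a sketch: keep a small store of generated names keyed by something like a map region, and consult it before generating a new name.  Everything here is hypothetical, including `makeNameMemory`, the `'region-1'` keys, and the stand-in list of names.

```javascript
// Hypothetical: a small store that remembers generated names by key.
function makeNameMemory(generate) {
  const remembered = new Map();
  return function recall(key) {
    if (!remembered.has(key)) remembered.set(key, generate());
    return remembered.get(key);
  };
}

// Stand-in generator (an assumption); the real one would call the name generator.
const fakeNames = ['Admiral Dyg', 'Baron Mesh'];
let next = 0;
const recall = makeNameMemory(() => fakeNames[next++ % fakeNames.length]);

// Two features in the same region share the remembered name.
const coastName = recall('region-1') + "'s Boneyard";
const pointName = recall('region-1') + "'s Folly";
console.log(coastName + ' / ' + pointName);
```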

Another is the capability to adjust the distribution of some terms on a per-map basis.  For example, I might have a distribution for the names of bays:
'<bays>': 'Bay [10] | Cove [5] | Basin [3] | Bight | Estuary'
Mostly I want to use “Bays" and “Coves" and rarely “Bight" or “Estuary".  This reflects the relative proportions of those names on today's maps.  But maybe on this map, bays are mostly called “Bights" and only occasionally other names.  Right now I don't think there's a way to write a “meta-rule" to figure out the distribution within another rule.
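Outside the grammar, one way to approximate such a “meta-rule" is to build the rule string per map.  The sketch below keeps the same set of weights but deals them out to the terms in a different order for each map.  This is only one guess at what a meta-rule could do; `remapWeights` is a made-up name, and the output uses the `term[n]` notation from earlier.

```javascript
// Hypothetical "meta-rule": keep the weights but deal them out to the terms in a
// different order for each map, then emit an ordinary weighted rule string.
function remapWeights(terms, weights, rng = Math.random) {
  const shuffled = terms.slice();
  // Fisher-Yates shuffle of the terms
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  return shuffled
    .map((term, i) => (weights[i] > 1 ? term + '[' + weights[i] + ']' : term))
    .join(' | ');
}

const terms = ['Bay', 'Cove', 'Basin', 'Bight', 'Estuary'];
const weights = [10, 5, 3, 1, 1];
console.log(remapWeights(terms, weights)); // e.g. "Bight[10] | Bay[5] | ..."
```

On one map this might make “Bight" the dominant term, while another map keeps “Bay" on top.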

Next time I'll start to work on bay names.

Thursday, June 14, 2018

The Naming of Places (Part 3): The Lost Coast

I'm still not entirely sure about my approach to generating place names, but I'll start off by working on place names for “The Lost Coast."  The idea with the Lost Coast was to create a stretch of flat, empty coastline on the map and then give it an evocative name to create a little mystery.  So far I've just worked on creating and labeling a spot, but I've just been using “The Lost Coast" label for all of them as a placeholder.
I'd like to generate a name that indicates that this stretch of coast is empty for some mysterious or negative reason.  It's filled with poisonous vapors, perhaps, or frequented by pirates, or ... well, use your imagination.  Place names tend to be short and pithy -- names like “Pirate Island" and not like “The Small Island of Cimmerian Pirates and a Mountain With Mysterious Caves," so to start with I'll just try to create names in the form “The <Adjective> Coast," where the adjective phrase is just a single word, e.g., “The Lost Coast," “The Empty Coast", etc.  (And I'll elaborate further as I go.)

This is a little more focused than naming something like a mountain range, because in this case I know I want a name that conveys a specific meaning or mood, so the choice of names is much more constrained and manageable.

The first step is to collect appropriate adjectives.  To do this, I start with “lost," “empty" and some other appropriate adjectives that occur to me and then I use a thesaurus to find appropriate synonyms.  And I repeat that with all the synonyms I find, and so on.  Altogether, that takes about four hours.  I also add in a list of common fantasy monsters (e.g., “The Troll Coast").  I don't go looking for archaic terms because at this point I have almost 350 terms:

forgotten bitter  trackless
cursed heartless  burning 
damned bloodthirsty  flaming
doomed demonic  smoldering 
accursed malevolent  festering
blasted monstrous  putrefying
infernal craggy  rotting 
unlucky jagged bubbling
dead fatal  reeking 
unfortunate perilous  stinking
empty treacherous  malodorous
barren menacing  fetid
arid savage  rancid
deserted vicious  mephitic
desolate barbarian  leaden 
abandoned primitive  afflicted
forsaken savage  grieving
godforsaken feral futile 
destitute uninhabited  sterile 
forlorn lethal  impassable 
lonely toxic  merciless 
solitary pestilent  unforgiving
dreary grisly  cutthroat
lonesome terrible  harsh 
ruined hideous  acrid 
friendless uncharted  hard 
dismal unmapped adamantine
backward undiscovered  callous 
gloomy unexplored ruthless 
miserable untraveled killing
wicked unseen  slaying
stony secret  diabolic 
desperate enigmatic  nefarious 
tragic cryptic  criminal
wretched arcane outlaw 
hidden unknowable bandit
bleak delphic  raider 
somber sybilline  buccaneer 
windy augural corsair 
black fatidic picaroon
funeral vatical rogue 
melancholy incomprehensible  blackguard
mournful untold  charlatan
weary uncounted trickster
ravaged masked  freebooter 
ghastly cloaked highwayman 
murky clandestine privateer
dark concealed  robber
doleful enshrouded fugitive
dolorous tainted  marauder
hopeless infected berserker
shadowy defiled  brigand
sepulchral spoiled  pariah 
caliginous emaciated  leper
wailing harsh  pirate
abominable weather-beaten  rover 
nefarious weatherworn satanic
profane lightless  sinister 
worthless unlit foreboding
cruel frigid unlucky 
rocky frozen  ominous 
dangerous icy  horrible 
violent algific fiendish 
wild shivering teratoid
deadly wintry  bloody 
dire brumal blood-soaked
grim hiemal bloodstained
unknown torrid  blood-spattered
shrouded blistering suicidal 
veiled scorching malefic
blighted sweltering louring
gaunt searing rabid 
windswept austral ancient
tenebrous blazing primeval
stormy broiling primordial
starless scalding baleful 
stygian steaming mephitic
devastated recalescent abhorrent
pillaged desecrated  frightful
plundered looted  horrid
shattered stolen  false
spoiled  purloined deceptive
wasted poacher's Basilisk
ashen broken Centaur
pallid  blasted  Chimera
uncanny  cracked  Cockatrice
foggy  fractured  Djinni
misty  crippled  Dragon
glowering  splintered  Dwarven
impenetrable corrupted  Elvish
smoky  gray  Gargoyle
nubilous cinereal Gnoll
cimmerian  drab Gnome
sunless  muddy Goblin
dolent dusty  Gorgon
woebegone  sallow  Griffon
impossible  ashy  Hippogriff
deathly  colorless  Hobgoblin
weeping  faded Hydra
howling eerie  Kobold
groaning unexplained  Manticore
growling unnamed Medusae
shrieking nameless Minotaur
keening mysterious Mummy
bellowing occult  Ogre
whimpering mystical  Orc
crying orphic Skeleton
screaming acroamatic Spectre
barking unearthly  Ent
dreadful  ethereal Troll
godless haunted Vampire
unhallowed  tormented Wight
pagan vaporous Wraith
heathen impassable  Wyvern
unholy  pathless Zombie

These aren't in any particular order, but they're all “reasons you wouldn't want to be here" in some sense or another.  Some are very specific, such as “deathly" or “blood-spattered" but others only hint at why the coast is uninhabited:  “eerie", “shrieking" and so on.  It was a lot of work to put this list together, and I can see that it's going to be a lot of work if I have to do this for every type of place name.  On the other hand, I'm not sure I have any better approach available.  For cases where I have a list of example place names (e.g., a list of river names in the US), I can take advantage of that to generate a list of candidate names.  But I don't have a good list of example place names for “Lonely Coasts" and I found in going through the synonyms that I had to apply a lot of judgement to pick and choose among the synonyms.  So maybe this is the best approach available.

It's also apparent to me that if these words were annotated with meaning, it would make it easier to re-use the lists.  For example, suppose I want to create a “Lonely Mountain" as in The Hobbit -- a single mountain in the middle of a plain -- and give it a place name.  The words in the Lost Coast list that have some semantic connection to the concept of “lonely" could be reused to name the mountain -- “The Abandoned Mountain," “The Forgotten Mountain," and so on.  An initial approach might just be to sort the words into categories of this sort.  But some of the words that have nothing to do with loneliness seem suitable as well -- “The Haunted Mountain," for example.  So perhaps it's just going to require a lot of work and a custom approach for each place name.

I also want some variation in the “Coast" part of the place name, so I go through a similar process to collect synonyms for coast.  There are far fewer of these -- really, in common usage the only synonyms are “coast" and “shore", although “strand" and “bank" are probably familiar to many people.  For some variety, I can throw in archaic forms and words for “cliffs" -- even though on my maps these areas are not usually cliffs.  That gets me this list:

coast foreshore bracks
shore tidewater bluffs
shores seaside scarps
strand shoreside glint
bank seaboard ledra
banks cliffs staithe
slakes cleo brink
outland cleeves rivage
warth cloughs verge
wash heughs

Some of these words are fairly obscure, but context on the map should make the meaning obvious.  Note that in this list, I have to manually decide whether using the singular, the plural or either makes sense.  For example, “shore" can be either (e.g., “The Lost Shore" or “The Lost Shores") but “coast" only works in the singular, and “bluffs" only works in the plural.

Now I can create a new label by selecting randomly from each list:
The Wraith Coast
The Violent Verge
The Grim Coast
The Bandit Bank
The Baleful Coast
The Delphic Bank
The Deceptive Slakes
The Primitive Warth
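The selection step behind names like these can be sketched in a few lines of Javascript.  This is an illustration rather than the code actually used for the maps: the lists are cut down to a handful of entries, and the `forms` arrays are an assumed way of recording which singular and plural forms were judged acceptable for each coast term.

```javascript
// A few entries from each list. Each coast term lists the forms that read well:
// "shore" works either way, "coast" only singular, "bluffs" only plural.
const adjectives = ['Wraith', 'Violent', 'Grim', 'Baleful'];
const coastTerms = [
  { forms: ['Coast'] },
  { forms: ['Shore', 'Shores'] },
  { forms: ['Bluffs'] },
];

// Pick one element of a list with equal probability.
function pickOne(list, rng = Math.random) {
  return list[Math.floor(rng() * list.length)];
}

// Build a label in the form "The <Adjective> <Coast>".
function lostCoastName(rng = Math.random) {
  const adjective = pickOne(adjectives, rng);
  const coast = pickOne(pickOne(coastTerms, rng).forms, rng);
  return 'The ' + adjective + ' ' + coast;
}

console.log(lostCoastName()); // e.g. "The Grim Shores"
```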
If the modifier in the Lost Coast name is (or can be) a noun, then an alternate form of naming is to say “The Coast of <noun>."  For example, another way to say “The Wraith Coast" is “The Coast of Wraiths."  This variant adds some more variety to the naming, but notice that you cannot just swap the words around between the two forms.  For that reason, I've set up a second list of adjectives specifically for this form that (mostly) don't overlap with the first list of adjectives.
The Kobold Strand
The Dark Rivage
The Strand of Blasphemy
The Tidewater of Lost Hopes
The Seaside of Chimeras
The Bank of Monsters
The Brumal Strand
The Shore of Manticores
This tends to produce too many obscure terms for “coast".  I need to weight the “coast" terms so that the more common usages get picked more frequently than the obscure ones.  Because of the size and diversity of the adjectives list, it's not really a problem to pick among all those alternatives equally.  But the “coast" list is much smaller and some of the synonyms are quite obscure so I probably don't want to use them too often.  The weights may take some tweaking, but here's a weighted sample:
The Sweltering Coast
The Bank of Lost Hope
The Louring Shore
The Devastated Shore
The Pagan Coast
The Unholy Shores
The Coast of Griffons
The Cleeves of Kobolds
The Bluffs of Centaurs
The Coast of Orcs
The list is now much heavier on the common terms like “coast" and “shore."

The Coast of Vampires, in ti Numru Bay, Zomle Zo

Normally, a generic geographical feature might be labeled with someone's name, like “Jason's Point" or “Frederick Mountain."  That's not something that's too commonly done with coastlines (in truth, coastlines are rarely named per se), but at any rate it's not something I can do here because a name like “Jason's Coast" doesn't convey any mystery or explanation about why the coast is deserted.  I'm relying upon the adjective portion of the place name to convey that information.

However, it occurs to me that perhaps I can use the noun portion of the place name to provide this meaning.  For example, I could call the coastline “Jason's Graveyard" and provide some of the same atmosphere as names like “The Forgotten Shores."  The noun I use has to have a negative connotation and designate a place, but it doesn't have to be explicitly linked to the coast, because the placement of the label should indicate to the reader that it is the coastline that is being labeled.  So something like “Graveyard" should work.  I could probably get a bit further afield with phrases like “Jason's Calamity" but that starts to sound too much like I'm labeling a specific event that happened in that location.

For the proper name part of this place name I use the existing fantasy name generator, to get names like “zum Nassir's Graveyard."  I can use the proper name by itself, but I can also add a title to the name (like “Commodore") to give it more of a naval feel, or I can use a noble's title (like “Baron") to imply that this area is famous and named because of a bad thing that happened to an important man here.
Obviously these names can get quite lengthy, so I'm going to have to go back and add some logic to shrink the size of the label when the name gets long.

But I don't have to use a proper name as the first part of this place name.  I can use a generic name, like “The Sailor's Graveyard."  (Note that I have to add “the" to the start of the place name when I'm using a generic noun.)  Here I'll want to use names related to sailing, such as “sailor," “mariner," “pirate," and so on to reinforce the image I'm trying to convey.

Here's a representative sample of the possible names:
The Colorless Coast
Retgres's Litten
The Uncounted Coast
The Cutthroat Coast
The Monstrous Bank
Commander Lozchem's Necropolis
The Weatherworn Coast
The Cryptic Bank
The Shipwrake Shore
The Hobgoblin Cliffs
I think that makes a nice variety of names, but I'll continue to be on the lookout for new possibilities.  (And if you have any ideas, please suggest them in the comments!)