Interesting data:

What surprises me about this is not that passport holding has risen so much but that it was so low as recently as 30 years ago: There were only 7 million passport holders in 1989. That doubled by 1991 and doubled again by 1995.

Roughly speaking, this means that about 3-4% of American adults owned a passport in 1989. Doesn't that seem startlingly low? I would have guessed double that for business travelers alone.

Here's another oddity: after quadrupling between 1989 and 1995, the number of passport holders went up more than 350% over the next 27 years while the number of overseas travelers went up only 150%:

During the pandemic, international travel collapsed but passport holding just kept on increasing at the same rate as always. Odd. I suppose part of this might be attributed to the popularity of having a passport as ID rather than for travel. Is that a thing?

You'll notice a small upward bump in passport holding in 2007-09. This is probably in anticipation of new rules that took effect in 2009 requiring a passport for travel to Canada and Mexico. However, the effect is small and passport growth continued apace after that. It doesn't explain much.
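For anyone who wants to check the growth arithmetic above, here is a rough sketch. It uses only the figures quoted in this post (7 million holders in 1989, two doublings by 1995, then increases of "more than 350%" and 150%). The variable and function names are mine, and because the passport figure is a "more than," the 2022 number is a floor rather than an estimate.

```python
# Figures quoted in the post; nothing here comes from an outside dataset.
HOLDERS_1989 = 7_000_000


def grow(base: float, pct_increase: float) -> float:
    """Apply a percentage increase. A 350% rise multiplies by 4.5, not 3.5."""
    return base * (1 + pct_increase / 100)


# Doubled by 1991, doubled again by 1995.
holders_1995 = HOLDERS_1989 * 2 * 2

# "More than 350%" over the next 27 years, so this is a floor, not an estimate.
holders_2022_floor = grow(holders_1995, 350)

# Growth multiples for passports and overseas travelers over the same 27 years.
passport_multiple = grow(1.0, 350)
traveler_multiple = grow(1.0, 150)

# How much faster passport holding grew than overseas travel.
passports_per_traveler_multiple = passport_multiple / traveler_multiple

print(f"1995 holders: {holders_1995 / 1e6:.0f} million")
print(f"2022 holders: at least {holders_2022_floor / 1e6:.0f} million")
print(f"Passports per overseas traveler: up about {passports_per_traveler_multiple:.1f}x")
```

In other words, 28 million holders in 1995, at least 126 million by 2022, and roughly 1.8 times as many passports per overseas traveler as before.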

Now we're talking: Maine and Colorado say Trump can't run for president while Michigan and California say he can. No matter how you come down on this question, it really does seem like it needs a definitive national answer. Supreme Court, here we come.

I accidentally found myself last night on the home page of the State Department's list of (non-natural) overseas deaths, which Congress has required it to compile since 2003. I learned a few things:

  • Americans like to go to sunny places and then drown themselves.
  • Man, do a lot of Americans die in Mexico.
  • An awful lot of Americans travel overseas to commit suicide.

You want some stats? I've got 'em. First, here are overseas deaths by year:

Good news! I plotted deaths versus number of overseas trips and it looks like travel to foreign countries has gotten safer and safer. In 2010, about 18 travelers per million met an untimely demise. By 2022 that was down to 8 per million.
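If you want to see how a per-million rate like this is put together, here is a minimal sketch. The function names are hypothetical, the only numbers plugged in are the two rates quoted above, and I'm assuming "per million" means deaths divided by overseas trips, which is how the chart was described.

```python
def deaths_per_million(deaths: int, trips: int) -> float:
    """Non-natural deaths per million overseas trips."""
    if trips <= 0:
        raise ValueError("trips must be positive")
    return deaths * 1_000_000 / trips


def percent_change(old: float, new: float) -> float:
    """Percentage change from old to new; negative means a decline."""
    return (new - old) / old * 100


# Rates quoted in the post, in deaths per million travelers.
RATE_2010 = 18.0
RATE_2022 = 8.0

decline = percent_change(RATE_2010, RATE_2022)
print(f"Change in death rate, 2010 to 2022: {decline:.0f}%")
```

Going from 18 to 8 per million is a drop of a bit more than half.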

Here are the most popular ways to die:

Auto accidents for the win! If you put together all road accidents (auto, bus, motorcycle) it's the biggest category by a landslide.

Finally, here are the most dangerous places. This is not really very useful since obviously places with lots of visitors will also have lots of deaths. Also, the State Department does a lousy job of standardizing place names, so these numbers aren't 100% accurate.

That said, Tijuana takes the fatality honors. It's the most dangerous city in the most dangerous country for non-natural American deaths. Unsurprisingly, your basic European tourist destinations (London, Paris, Rome) are pretty safe. You might have a heart attack there, but you probably won't die of non-natural causes.

I should note that this list is semi-random. I chose all the places with very high death counts except that I got bored with Mexico after a while and gave up. Aside from that, I included popular places plus places that seemed to have an awful lot of deaths considering their (probably) small number of American visitors. Riyadh, anyone? Seoul? Phnom Penh? Chiang Mai?

Over at National Review, Jeffrey Blehar tells us that Nikki Haley's real gaffe last night wasn't racism or something racism-adjacent. Oh no:

The real gaffe Haley committed on Wednesday was that, when she froze up under an unpredicted question and defaulted to her factory settings in answering, those answers demonstrated such contempt for the intelligence of her voters. We can be told the Civil War was about slavery, Nikki — we’re all adults here.... That’s why this little gaffe, however minor, memorably reveals something about Haley; we rarely get such accidental insight into how little politicians think of their own voters.

Uh huh. Haley's real problem was not realizing how enlightened modern Republicans are about issues of race. Color me skeptical. Consider this news item from 2015:

The division over what children should learn in school is clear in Texas, where academic standards list slavery third among the causes of the war, after sectionalism and states’ rights — written deliberately in that order to telegraph what some elected Texas officials described as slavery’s secondary role in driving the conflict.

Slavery was a “side issue to the Civil War,” said Pat Hardy, a Republican member of the State Board of Education, when the board adopted the standards in 2010. “There would be those who would say the reason for the Civil War was over slavery. No. It was over states’ rights.”

In a poll that same year, only 49% of Republicans agreed that slavery was the main reason for the Civil War.¹ Nearly half of Republicans say it would be bad to pay more attention to the history of slavery and racism. 73% of Republicans say anti-white bias is as big a problem as anti-Black bias—a number that's increased steadily for the past decade. 82% of Republicans oppose Black Lives Matter and 76% oppose even nonviolent protests against Black deaths. Only 6% of Republicans believe statues of Confederate leaders should be taken down. In Congress, two years ago, Republicans voted 2:1 not to remove statues of Jefferson Davis and other Confederate leaders from the US Capitol.

I think Haley was keenly aware of what the Republican base wanted to hear from her. She didn't misjudge either her voters or their intelligence. She just forgot that these days they'd mostly shrug and figure she had mentioned slavery because everyone has to toe the PC line in public. It's a damn shame, but what can you do? The wink and nod are understood.

Believe it or not, though, some good news has come out of all this. You may recall that Donald Trump was asked a few years ago about the roots of the Civil War and produced this deathless answer:

People don’t realize, you know, the civil war — if you think about it, why? People don’t ask that question, but why was there a civil war? Why could that one not have been worked out?

Today, however, Trump's MAGA PAC slammed Haley because she "was unable to say that the Civil War was about slavery." That's progress! Trump now knows what the Civil War was about.

¹In fairness, Democrats didn't do that much better (62% agreed). And in a more recent poll only 70% of Americans could even identify the North as the winner. So maybe the real issue here is just that we're all idiots.

Last night Nikki Haley was asked about the cause of the Civil War. Oh, it was this and that, she said, freedoms and the role of the government, that sort of thing. How about slavery? “What do you want me to say about slavery?” Haley asked.

As near as I can tell, this is the accepted Republican view of the Civil War, along with climate change not existing, COVID coming from a Chinese lab, and Democrats stealing the 2020 election.

But the best part of this little gaffe came this morning, when Haley went on a radio show and said, well, of course slavery. Duh. And then said the whole thing was politically motivated:

Haley on Thursday also accused the man who asked her the question of being a Democratic plant.... “We see these guys when they come in. We know what they’re doing,” Haley said.

I suppose it's too subtle to point out that this question wasn't exactly the result of deep oppo research. I mean, the guy probably was a Democratic plant, but so what? It was a softball question. No one asked if Haley was still beating her husband. It's like asking if you disapprove of the Holocaust. The answer is yes. If you then want to add a few remarks about not forgetting the gay victims, the gypsies, and others in addition to the six million Jews, that's fine. But it's not a hard question.

Likewise, the cause of the Civil War was: disagreement over slavery and the expansion of slavery. If you then want to add some stuff about tariffs and federalism as other thorns in the side of the South, that's fine. But it's hardly a gotcha question, unless you happen to be running for the presidency in the modern Republican Party.

Apropos of nothing in particular, I thought it might be a good time to take a fresh look at Russian-controlled territory in Ukraine these days:

The southern portion of the Russian territory is east of the Dnieper river and is all but impossible to attack successfully. The Dnieper is a big river. Further north, the natural obstacles are less formidable but the territory adjoins Russia and is easy to keep armed and supplied.

So, I dunno. Maybe once Ukraine gets those F-16s in the air it will make a difference. At the moment, though, it's a little hard to see how either side is going to break this stalemate without some significant change in circumstances.

POSTSCRIPT: Like what? The aforementioned F-16s have potential. A cutoff of aid to Ukraine thanks to US political infighting could do it. Russian exhaustion is possible, similar to Germany's in World War I. Direct entry into the war by the US would definitely do it, but Russia would have to do something phenomenally stupid to provoke that.

My previous post was about a New York Times lawsuit accusing Microsoft and OpenAI of copyright infringement for training their AI models on Times data. This prompts me to remind everyone of what "training" really means. Most people have a vague idea that AI models just trawl the internet and ingest whatever they find, which then gets analyzed and regurgitated by a vast robotic algorithm. And that does happen. But there's way more to it.

AI models are useless without a huge amount of human assistance. In particular, they rely on vast and expensive amounts of annotation, sometimes called labeling, that explains to the AI just what it's seeing. This is a roughly $3 billion (and growing) business populated by dozens of specialist companies such as Scale, Labelbox, and CloudFactory.

There are several different types of annotation:

  • Text, including semantic, intent, and sentiment annotation. Semantic annotation is the simplest kind: armies of human annotators are asked to tag, say, all song titles or all proper names or all telephone numbers in a piece of text. Intent annotation involves telling the AI whether a sentence is a question or an assertion or an opinion. Sentiment annotation is about tagging emotions. This tells the AI whether, say, a review is positive or negative, or whether a statement is an insult or a sarcastic compliment.
  • Image, which is most commonly object recognition. This can be simple or detailed, such as tagging buildings in an image (simple) or tagging in detail all the parts of a building (doors, windows, ledges, etc.).

    Examples of text annotation (left) and image annotation (right) from Scale, one of the leaders in the AI annotation industry.
  • Video, which is similar to image annotation. This is one of the most common types of annotation because it's used in huge volumes by driverless car companies. In many cases, video is literally tagged frame by frame in order to recognize motion and change over time.
  • Audio, which mostly involves classifying different types of sound (music, human voices, animal sounds, police sirens, etc.).
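To make the list above a little more concrete, here is a rough sketch of what labeled data might look like once an annotator is done with it. This is purely illustrative: the class names, label names, and sample sentence are all made up, and it is not the format used by Scale or any other annotation company.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """A labeled stretch of text, e.g. a phone number or a proper name."""
    start: int
    end: int
    label: str


@dataclass
class TextAnnotation:
    text: str
    spans: List[Span] = field(default_factory=list)  # semantic tags
    intent: Optional[str] = None      # e.g. "question", "assertion", "opinion"
    sentiment: Optional[str] = None   # e.g. "positive", "negative"

    def tag(self, phrase: str, label: str) -> None:
        """Label the first occurrence of phrase, as a human annotator would."""
        start = self.text.index(phrase)  # raises ValueError if phrase is absent
        self.spans.append(Span(start, start + len(phrase), label))

    def tagged(self, label: str) -> List[str]:
        """Return the text of every span carrying the given label."""
        return [self.text[s.start:s.end] for s in self.spans if s.label == label]


@dataclass
class Box:
    """A labeled rectangle in an image or a single video frame, in pixels."""
    x: int
    y: int
    width: int
    height: int
    label: str


@dataclass
class FrameAnnotation:
    """Object labels for one video frame; video is a long list of these."""
    frame_index: int
    boxes: List[Box] = field(default_factory=list)


# Hypothetical text example: semantic tags plus intent and sentiment.
review = TextAnnotation("Call 555-0100 and ask for Marian. Great service!")
review.tag("555-0100", "PHONE_NUMBER")
review.tag("Marian", "PROPER_NAME")
review.intent = "assertion"
review.sentiment = "positive"

# Hypothetical image/video example: one car tagged in the first frame.
frame = FrameAnnotation(frame_index=0, boxes=[Box(40, 60, 120, 80, "car")])

print(review.tagged("PHONE_NUMBER"))
print(review.tagged("PROPER_NAME"))
```

The point of the sketch is that every one of those fields has to be filled in by someone (or something) before the raw text or image is much use for training.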

Routine labeling is done by armies of English speakers in developing countries for a couple of bucks an hour. Popular sites include Kenya, India, and the Philippines. High-end labeling for law, medicine, and other specialized fields is mostly done in the US and can pay pretty well.

The holy grail, of course, is automating the annotation process so that humans are no longer necessary. So far, only about a quarter of annotation is done via software, but that's likely to grow as AI models become more and more capable.

The point of all this is that "training" is not merely the act of hoovering up data and then spitting it back out. It's a laborious process that's useless unless a lot of value is added in the form of human labeling. Among other things, this is one of the reasons that I think training AI on copyrighted information is OK. It's truly a transformative process, not merely a simpleminded and derivative copy and paste.

The New York Times is suing Microsoft and OpenAI for training their AI software on Times stories:

In a complaint filed Wednesday, the Times said the technology companies exploited its content without permission to create their AI products, including OpenAI’s humanlike chatbot ChatGPT and Microsoft’s Copilot. The tools were trained on millions of pieces of Times content, the suit said, and draw on that material to serve up answers to users’ prompts.

....Tech companies building generative-AI tools have generally argued that content available on the open internet can be used to train their technologies under a legal provision called “fair use,” which allows for copyrighted material to be used without permission in certain circumstances. In its suit, the Times said the fair use argument shouldn’t apply because the AI tools can serve up, almost verbatim, large chunks of text from Times news articles.

I've expressed my opinion before that mere training is indeed fair use. If things were otherwise, everybody would be liable who had ever learned something from a Times article and then used it to publicly form an opinion or analysis.

But routinely serving up large verbatim chunks of stories is a different thing. This got me curious, so I asked GPT-4 about the Times' coverage of the James Comey letter regarding Hillary Clinton at the tail end of the 2016 presidential campaign:

In October 2016, the New York Times and other sources reported on a letter from then-FBI Director James Comey regarding the reopening of the investigation into Hillary Clinton's use of a private email server.

....The letter by Comey was criticized for its vagueness, as it acknowledged that the FBI did not know the contents of the emails in question or whether they were relevant to the investigation.

Etc.

That seemed fine, so I asked what the Times had actually said:

I'm unable to access the full content of the New York Times article from October 2016 regarding James Comey's letter about the Hillary Clinton email investigation. For detailed information and to read the article, I recommend visiting the New York Times website or accessing their archives directly.

This is just one query and might not represent typical behavior. But if it does, it appears that GPT-4 reads stuff from the Times and then summarizes it briefly the same way any human would do. Unless there's a lot more to it, including the ability to essentially act as a substitute for reading the Times, I still don't see anything wrong here. Treating this as copyright infringement would set an enormously dangerous precedent.

POSTSCRIPT: The lawsuit is here. It includes some examples of large-scale copying, mostly from Bing and mostly produced by laboriously asking for single sentences or paragraphs at a time. I'm skeptical that this is truly serious infringement since it's so artificial, but I can at least imagine a judge enjoining Microsoft from reproducing so much content. Overall, though, I remain unconvinced that training, summarization, and brief excerpts should be barred.

Film critic Richard Zoglin takes to the Washington Post today to echo one of my pet peeves. It happened on an outing to see Napoleon:

I counted a dozen commercials, for everything from Hyundai to M&M’s, before the Regal voice of God told us to silence our cellphones and “enjoy the show” — after which came another slew of ads, for various Pepsi drinks; six trailers for upcoming movies (because where better than a screening of “Napoleon” to look for fans of “Drive-Away Dolls” and “Aquaman and the Lost Kingdom”?); and a pitch for the Regal Unlimited subscription plan. Thirty minutes in hard-sell captivity before the movie finally started.

This whole thing has gotten completely out of hand. I can't say it's the biggest reason I see so few movies these days, but it's definitely one of them.

And as long as I'm complaining about corporate suckitude, let's talk about airlines. Marian and I are going to Europe this spring and last night I bought the tickets. After laying out several thousand dollars, I clicked to choose our seats and almost fell off my chair. Finnair charged me $500 to reserve our seats to Helsinki and then on to Prague. Not special seats. This was merely the standard fee to reserve any seat at all. Then it was another $100 to British Airways for the leg home from Vienna to London.¹

The last time we went to Europe was only two years ago and it didn't involve any of this crap. Is this because things have changed in a mere two years? Or because some airlines charge and others don't? Whatever the case, it amounted to a 10% surcharge just to get a seat assignment. wtf?

¹But American Airlines didn't charge anything for the final leg home from London to LA. Is this because they have a different policy? Because I booked the tickets through American? Because I'm a lifetime AAdvantage Gold member from my old road warrior days? I don't know.