Japan is no stranger to earthquakes. The country sits at the convergence of four tectonic plates and, as a result, experiences around 1,000 earthquakes a year, averaging about 3 a day. Most of these quakes pass without notice, the majority being below magnitude 5, but there is always the chance of a much bigger quake.

Interactive Map of Earthquakes near Japan. Made with Folium
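The plotting code for the map isn't shown here, but a map like this can be sketched in a few lines of Folium. The quakes DataFrame and its latitude, longitude, and mag columns below are assumptions about the data behind the figure, not the original code.

import folium

# Minimal sketch of the interactive map; `quakes` is assumed to be a DataFrame
# of USGS events with 'latitude', 'longitude', and 'mag' columns
m = folium.Map(location=[38.32, 142.37], zoom_start=5)
for _, row in quakes.iterrows():
    folium.CircleMarker(
        location=[row["latitude"], row["longitude"]],
        radius=row["mag"],            # scale marker size by magnitude
        popup=f"M{row['mag']}",
        color="crimson",
        fill=True,
    ).add_to(m)
m.save("japan_quakes.html")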

A magnitude 9.1 earthquake is an incredibly rare event anywhere in the world and, according to the official position of the US Geological Survey, it is impossible to predict the time and location of such an event before it happens. That doesn't mean the probability of such an event can't be estimated, though. The Gutenberg-Richter law states that the relationship between the magnitude of an earthquake and the logarithm of the frequency with which earthquakes of at least that magnitude occur is linear, so we can estimate the likelihood of such an event by fitting a linear model.
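Concretely, if N(M) is the number of earthquakes per year of at least magnitude M, the law says log10 N(M) = a - b*M for constants a and b. A quick illustration, using made-up placeholder values for a and b rather than anything fitted to real data:

# Gutenberg-Richter: log10(N) = a - b * M, i.e. N = 10 ** (a - b * M)
# a and b are placeholder values purely for illustration
a, b = 7.0, 1.0
annual_rate_m9 = 10 ** (a - b * 9.0)  # expected M>=9 quakes per year
return_period = 1 / annual_rate_m9    # ~100 years with these placeholder values
print(annual_rate_m9, return_period)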

The Fukushima Daiichi Nuclear Accident

I've read online that during the safety analysis, the engineers at Fukushima used historical earthquake data dating back to 1600 CE to build a regression model for the likelihood of significant earthquakes. Instead of using the accepted Gutenberg-Richter model, they saw a kink in the data and assumed the appropriate regression was not linear but polynomial. The resulting model implied that a magnitude 9 or above earthquake was essentially impossible, occurring once within a 19,000 year period.

I haven't found much evidence to support this other than a few sources, such as Berkeley's Machine Learning Crash Course, which cites a paper by Brian Stacey titled Fukushima: The Failure of Predictive Models, which in turn cites Nate Silver's book, The Signal and the Noise: Why So Many Predictions Fail — but Some Don't, and a report by M. Ragheb titled FUKUSHIMA EARTHQUAKE AND TSUNAMI STATION BLACKOUT ACCIDENT.

What I can tell is that the station was not designed to withstand a magnitude 9 earthquake. The 3/11 earthquake produced a tsunami towering more than 14 meters (46 feet). The safety design team designed the plant to withstand a magnitude 8.6 earthquake, which would have generated a tsunami 5.7 meters tall, so the actual tsunami easily breached the 6 meter tall seawall surrounding the plant.

The assumption that these types of quakes were impossible wasn't uncommon. Emile Okal, an earthquake and tsunami expert, explains some of the rationale:

Essentially, the feeling was that — if you had a very, very old ocean floor — when it was eventually recycled into the mantle of the Earth, you wouldn’t get these mega-earthquakes.

In his book, Nate Silver uses data from January 1st, 1964 to March 11th, 2011 to recreate a polynomial model similar to the one the safety engineers are assumed to have used. Nate uses data from the US Geological Survey, pulling a 1 degree by 1 degree box of data around the epicenter of the 3/11 earthquake. We can use the USGS API to do the same.

import datetime
import pandas as pd

# 1 degree by 1 degree box centered on the 3/11 epicenter
min_lat, max_lat = 38.32 - 0.5, 38.32 + 0.5
min_long, max_long = 142.37 - 0.5, 142.37 + 0.5

# Same date range Nate Silver used: January 1, 1964 to March 11, 2011
start_time = datetime.datetime(1964, 1, 1)
end_time = datetime.datetime(2011, 3, 11)
url = f"""\
https://earthquake.usgs.gov/fdsnws/event/1/\
query.csv?\
starttime={start_time.strftime("%Y-%m-%d")}&\
endtime={end_time.strftime("%Y-%m-%d")}&\
maxlatitude={max_lat}&\
minlatitude={min_lat}&\
maxlongitude={max_long}&\
minlongitude={min_long}&\
minmagnitude=4.5&\
eventtype=earthquake&\
orderby=magnitude&\
limit=20000"""

df = pd.read_csv(url)
df.describe().T
count mean std min 25% 50% 75% max
latitude 149.0 38.363785 0.281879 37.822 38.149 38.387 38.578 38.819
longitude 149.0 142.335181 0.304166 141.884 142.075 142.273 142.623 142.842
depth 149.0 43.355034 17.329300 5.000 33.000 42.000 51.400 102.800
mag 149.0 5.109060 0.648253 4.500 4.600 4.900 5.300 7.700
nst 39.0 181.948718 171.303048 9.000 40.000 130.000 296.000 698.000
gap 35.0 96.257143 44.071208 17.400 64.600 105.000 129.400 212.900
dmin 0.0 NaN NaN NaN NaN NaN NaN NaN
rms 93.0 0.953978 0.231602 0.500 0.800 0.930 1.100 1.600
horizontalError 0.0 NaN NaN NaN NaN NaN NaN NaN
depthError 45.0 7.373333 4.072212 2.000 5.100 5.900 7.600 20.600
magError 8.0 0.346250 0.155466 0.200 0.230 0.275 0.485 0.580
magNst 61.0 17.131148 32.103725 1.000 3.000 6.000 20.000 225.000

This data pull includes only 149 earthquakes, which doesn't seem robust enough for a proper analysis. Instead, I'll use this data source from Kaggle and filter it by the same criteria.
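A sketch of that step might look like the following; the file name and column names are assumptions about the Kaggle export rather than code from the original notebook, and the file is assumed to already be limited to the region around Japan.

# Assumed layout: a CSV with 'time', 'latitude', 'longitude', and 'mag'
# columns, similar to the USGS catalog format
df = pd.read_csv("japan_earthquakes.csv", parse_dates=["time"])
df = df[
    (df["time"] >= "1964-01-01")
    & (df["time"] < "2011-03-11")
    & (df["mag"] >= 4.5)
]
df.describe().T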

count mean std min 25% 50% 75% max
latitude 5483.0 38.331719 8.039854 23.667 30.1765 39.298 46.3065 50.789
longitude 5483.0 144.591140 7.748972 124.293 141.0780 143.441 152.6375 158.670
depth 5483.0 59.023582 92.162826 0.000 10.0000 33.000 54.5500 631.900
mag 5483.0 4.849389 0.410098 4.500 4.6000 4.700 5.0000 8.300
nst 5482.0 116.054360 123.964728 5.000 32.0000 70.000 151.0000 929.000
gap 4702.0 106.654126 37.773677 9.900 82.6250 113.000 133.6000 306.600
dmin 0.0 NaN NaN NaN NaN NaN NaN NaN
rms 5405.0 0.904420 0.191105 0.120 0.7800 0.880 1.0100 1.870
horizontalError 0.0 NaN NaN NaN NaN NaN NaN NaN
depthError 1808.0 10.173507 7.412728 0.000 5.5000 8.100 12.8000 70.700
magError 0.0 NaN NaN NaN NaN NaN NaN NaN
magNst 3501.0 25.744073 30.298433 1.000 5.0000 15.000 36.0000 274.000

We will want to convert the data to a frequency-per-year basis, so some data manipulation is required. We can group by mag, count the events at each magnitude, and convert the counts to a per-year rate.
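Something along these lines produces the table below; dividing by the full time span of the data is my assumption, not necessarily what the original analysis did.

# Count events at each magnitude and divide by the length of the observation
# window in years to get an annual frequency
years = (df["time"].max() - df["time"].min()).days / 365.25  # assumed divisor
df_freq = (
    df.groupby("mag")
      .size()
      .rename("Yearly Frequency")
      .div(years)
      .to_frame()
)
df_freq.index.name = "Magnitude"
df_freq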

Magnitude  Yearly Frequency
4.5 114.054666
4.6 93.817577
4.7 77.117066
4.8 63.854895
4.9 46.663193
5.0 33.499260
5.1 24.952528
5.2 16.405796
5.3 14.342792
5.4 9.922068
5.5 8.939685
5.6 6.287251
5.7 6.581966
5.8 3.340102
5.9 4.420724
6.0 3.241864
6.1 2.455958
6.2 1.277098
6.3 1.277098
6.4 1.080621
6.5 0.884145
6.6 0.785906
6.7 0.392953
6.8 0.785906
6.9 0.392953
7.0 0.589430
7.1 0.098238
7.2 0.196477
7.3 0.294715
7.4 0.392953

As a fun exercise, I recreated the style that Nate Silver used in his book. Below is the code to generate the scatter plot.

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np

fig, ax = plt.subplots(figsize=(16, 9))

# Diamond-marker scatter of annual frequency versus magnitude on a log y-axis
df_freq.reset_index().plot.scatter(x='Magnitude',
                                   y='Yearly Frequency',
                                   logy=True,
                                   ax=ax,
                                   color='k',
                                   s=250,
                                   marker='D',
                                   clip_on=False)

plt.xticks(np.arange(4.5, 9.6, 0.5), size=20)
plt.yticks(size=20)

# Show y tick labels as plain decimals (0.001, 0.01, ...) instead of 10^x
ax.yaxis.set_major_formatter(ticker.FuncFormatter(
    lambda y, pos: ('{{:.{:1d}f}}'.format(int(np.maximum(-np.log10(y), 0)))).format(y)))

ax.tick_params(width=1, length=10, which='major')
ax.tick_params(width=0, length=0, which='minor')
ax.tick_params(width=0, length=0, which='major', axis='x')

# Baseline at one earthquake per year
plt.hlines(1, 0, 10, color='k')

plt.xlim(4.5, 9.5)
plt.ylim(0.0001, 1000)

plt.grid(linestyle='--', linewidth=2.5, alpha=0.3)

plt.ylabel("Annual Frequency:\nEarthquakes of at least this magnitude", size=20)
plt.xlabel("Magnitude", size=20)

# Hide all spines except the left one, matching the book's minimal style
for side in ['top', 'bottom', 'right']:
    ax.spines[side].set_visible(False)

plt.suptitle("TŌHOKU, JAPAN EARTHQUAKE FREQUENCIES\n"
             f"{'JANUARY 1, 2001-MARCH 10,2011':<45}",
             size=20,
             x=0.2,
             y=1);

The chart doesn't exactly match Nate's because the dataset covers a shorter time period (2000-2011), but it's a close enough match to fit a linear model.

from sklearn.linear_model import LinearRegression

# Fit log10(annual frequency) against magnitude, weighting each point by its
# frequency so the well-observed small quakes dominate the fit
lm = LinearRegression()
X = np.array(df_freq.reset_index()['Magnitude']).reshape(-1, 1)
y = np.log10(df_freq['Yearly Frequency'])
s = df_freq['Yearly Frequency']
lm.fit(X, y, sample_weight=s)

# Convert predictions back from log space to annual frequencies
pred = 10 ** lm.predict(np.arange(0, 11, 1).reshape(-1, 1))

The resulting model is a good approximation of Nate's, with an estimated likelihood of a magnitude 9 earthquake happening once every 800 years, while Nate's linear model predicts such an event every 900 years.
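The once-every-800-years figure can be backed out of the fitted line directly; a quick sketch, reusing lm from the snippet above:

# Predicted annual frequency of a magnitude 9 event under the linear fit,
# and the implied return period in years
annual_rate_m9 = 10 ** lm.predict(np.array([[9.0]]))[0]
return_period = 1 / annual_rate_m9
print(f"Roughly one magnitude 9 event every {return_period:.0f} years")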

Polynomial

I can't find much information on the polynomial model itself, but the closest, least complicated model that matches Nate's polynomial model is a fourth-degree fit: b4 mag^4 + b3 mag^3 + b2 mag^2 + b1 mag + b0. This model produces the following graph.
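For reference, a fourth-degree fit on the same log frequencies can be sketched with numpy.polyfit; this is only my reconstruction of what such a model might look like, not the code from the book or from the safety analysis.

# Fit log10(annual frequency) with a degree-4 polynomial in magnitude,
# again weighting by frequency so the common small quakes dominate the fit
mags = df_freq.reset_index()['Magnitude'].to_numpy()
log_freq = np.log10(df_freq['Yearly Frequency'].to_numpy())
weights = df_freq['Yearly Frequency'].to_numpy()
coeffs = np.polyfit(mags, log_freq, deg=4, w=weights)
poly = np.poly1d(coeffs)

# Convert back from log space to annual frequencies across the plotted range
poly_pred = 10 ** poly(np.arange(4.5, 9.6, 0.1))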