The Great Tōhoku Earthquake and the Fukushima Daiichi Accident
The Fukushima nuclear disaster was the most severe nuclear accident since Chernobyl; they are the only two events ever rated level 7, the maximum, on the International Nuclear and Radiological Event Scale. In 2011, an earthquake followed by a tsunami caused the disaster at the Japanese plant, and it all traces back to a mistake in the safety model.
Japan is no stranger to earthquakes. The islands sit at the meeting point of four tectonic plates. As a result, the area experiences around 1,000 earthquakes a year, averaging about 3 a day. Most of these quakes pass without notice, the majority being below magnitude 5, but there is always the chance of a much bigger quake.
A magnitude 9.1 earthquake is an incredibly rare event anywhere in the world and, according to the official position of the US Geological Survey, it is impossible to predict the time and location of such an event before it happens. That doesn't mean its probability can't be estimated, though. The Gutenberg-Richter law states that the relationship between the magnitude of an earthquake and the logarithm of the frequency with which it occurs is linear, so we can estimate the likelihood of such an event by fitting a linear model.
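As a toy illustration of what that linear relationship implies (the coefficients below are made up for the example, not fitted to any real catalogue):

```python
# Gutenberg-Richter: log10(N) = a - b*M, where N is the annual number of
# earthquakes of at least magnitude M. These coefficients are hypothetical,
# chosen only to show the shape of the law (b is typically near 1)
a, b = 4.0, 1.0

def annual_frequency(magnitude):
    """Expected quakes per year of at least this magnitude."""
    return 10 ** (a - b * magnitude)

print(annual_frequency(5.0))  # moderate quakes: fairly common
print(annual_frequency(9.0))  # great quakes: vanishingly rare
```

Each whole step up in magnitude divides the expected frequency by the same factor, which is exactly why the plot later in this post uses a log-scaled y-axis.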
I've read online that during the safety analysis, the engineers at Fukushima used historical earthquake data dating back to 1600 CE to build a regression model for the likelihood of significant earthquakes. Instead of using the accepted Gutenberg-Richter model, they saw a kink in the data and assumed the appropriate regression was not linear but polynomial. The resulting model implied that a magnitude 9 or greater earthquake was essentially impossible, occurring about once in 19,000 years.
I haven't found much evidence to support this beyond a few sources, such as Berkeley's Machine Learning Crash Course, which cites a paper by Brian Stacey titled Fukushima: The Failure of Predictive Models, which in turn cites Nate Silver's book The Signal and the Noise: Why So Many Predictions Fail — but Some Don't and a report by M. Ragheb titled Fukushima Earthquake and Tsunami Station Blackout Accident.
What I can tell is that the station was not designed to withstand a magnitude 9 earthquake. The 3/11 earthquake produced a tsunami towering more than 14 meters (46 feet). The safety design team designed the plant to withstand a magnitude 8.6 quake, which would generate a tsunami 5.7 meters tall, so the wave easily breached the 6-meter seawall surrounding the plant.
The assumption that these types of quakes were impossible wasn't uncommon. Emile Okal, an earthquake and tsunami expert, explains some of the rationale:
Essentially, the feeling was that — if you had a very, very old ocean floor — when it was eventually recycled into the mantle of the Earth, you wouldn’t get these mega-earthquakes.
In his book, Nate Silver uses data from January 1st, 1964 to March 11th, 2011 to recreate a polynomial model similar to the one the safety engineers are assumed to have used. He pulls data from the US Geological Survey in a 1-degree-by-1-degree box around the epicenter of the 3/11 earthquake. We can use the USGS API to do the same:
import datetime

import pandas as pd

# 1-degree-by-1-degree box centered on the 3/11 epicenter (38.32 N, 142.37 E)
min_lat, max_lat = 38.32 - 0.5, 38.32 + 0.5
min_long, max_long = 142.37 - 0.5, 142.37 + 0.5
start_time = datetime.datetime(1964, 1, 1)
end_time = datetime.datetime(2011, 3, 11)

# Query the USGS FDSN event service for magnitude 4.5+ earthquakes as CSV
url = f"""\
https://earthquake.usgs.gov/fdsnws/event/1/\
query.csv?\
starttime={start_time.strftime("%Y-%m-%d")}&\
endtime={end_time.strftime("%Y-%m-%d")}&\
maxlatitude={max_lat}&\
minlatitude={min_lat}&\
maxlongitude={max_long}&\
minlongitude={min_long}&\
minmagnitude=4.5&\
eventtype=earthquake&\
orderby=magnitude&\
limit=20000"""
df = pd.read_csv(url)
df.describe().T
This pull returns only 149 earthquakes, which doesn't seem robust enough for a proper analysis. Instead, I'll use this data source from Kaggle and filter it by the same criteria.
We will want to convert the data to a frequency-per-year basis, so some data manipulation is required. We can group by mag and generate a count of events at each magnitude.
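A minimal sketch of that manipulation (the tiny frame and the year span are stand-ins for the filtered Kaggle data; the real df_freq used below is built the same way):

```python
import pandas as pd

# Stand-in for the filtered catalogue: one row per earthquake
df = pd.DataFrame({"Magnitude": [4.5, 4.5, 4.6, 5.0, 5.0, 5.0, 6.1]})

n_years = 10.3  # assumed span of the catalogue in years

# Count events per magnitude, cumulate from the largest magnitude down so
# each row reads "earthquakes of at least this magnitude", then divide by
# the span to convert counts to a yearly rate
counts = df.groupby("Magnitude").size().sort_index(ascending=False)
df_freq = (counts.cumsum() / n_years).rename("Yearly Frequency") \
    .sort_index().to_frame()
print(df_freq)
```

The top-down cumulative sum matters: the Gutenberg-Richter relationship (and Nate Silver's chart) is stated in terms of quakes of at least a given magnitude, not exactly that magnitude.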
As a fun exercise, I recreated the style that Nate Silver used in his book. Below is the code to generate the scatter plot.
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np

fig, ax = plt.subplots(figsize=(16, 9))
df_freq.reset_index().plot.scatter(x='Magnitude',
                                   y='Yearly Frequency',
                                   logy=True,
                                   ax=ax,
                                   color='k',
                                   s=250,
                                   marker='D',
                                   clip_on=False)
plt.xticks(np.arange(4.5, 9.6, 0.5), size=20)
plt.yticks(size=20)
# Render log-scale y ticks as plain decimals with just enough precision
ax.yaxis.set_major_formatter(ticker.FuncFormatter(
    lambda y, pos: ('{{:.{:1d}f}}'.format(int(np.maximum(-np.log10(y), 0)))).format(y)))
ax.tick_params(width=1, length=10, which='major')
ax.tick_params(width=0, length=0, which='minor')
ax.tick_params(width=0, length=0, which='major', axis='x')
plt.hlines(1, 0, 10, color='k')
plt.xlim(4.5, 9.5)
plt.ylim(0.0001, 1000)
plt.grid(linestyle='--', linewidth=2.5, alpha=0.3)
plt.ylabel("Annual Frequency:\nEarthquakes of at least this magnitude", size=20)
plt.xlabel("Magnitude", size=20)
# Hide all spines except the left one
for side in ['top', 'bottom', 'right']:
    ax.spines[side].set_visible(False)
plt.suptitle("TŌHOKU, JAPAN EARTHQUAKE FREQUENCIES\nJANUARY 1, 2001-MARCH 10, 2011",
             size=20,
             x=0.2,
             y=1);
The chart doesn't exactly match Nate's because this dataset covers a shorter time period (2000-2011), but it's a close enough match to fit a linear model.
import numpy as np
from sklearn.linear_model import LinearRegression

lm = LinearRegression()
X = np.array(df_freq.reset_index()['Magnitude']).reshape(-1, 1)
# Fit in log space, per Gutenberg-Richter
y = np.log10(df_freq['Yearly Frequency'])
# Weight each point by its frequency so the sparse high-magnitude tail
# doesn't dominate the fit
s = df_freq['Yearly Frequency']
lm.fit(X, y, sample_weight=s)
# Transform predictions back out of log space
pred = 10 ** lm.predict(np.arange(0, 11, 1).reshape(-1, 1))
The resulting model is a good approximation of Nate's: it estimates a magnitude 9 earthquake happening once every 800 years, while Nate's linear model predicts such an event every 900 years.
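Turning a model's annual frequency into a return period like those is just a matter of inverting it. A quick sketch with illustrative coefficients (not the values actually fitted above):

```python
# Hypothetical linear fit in log space: log10(N) = a - b * M.
# These coefficients are illustrative only, not the fitted ones
a, b = 6.0, 1.0

annual_freq_mag9 = 10 ** (a - b * 9.0)  # expected mag-9+ quakes per year
return_period = 1 / annual_freq_mag9    # average years between such quakes
print(return_period)
```

With these made-up coefficients the implied return period is about a thousand years, the same order of magnitude as the 800- and 900-year figures above.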
I can't find much information on the polynomial model itself, but the simplest model that comes close to matching Nate's polynomial curve is a quartic: b4*mag^4 + b3*mag^3 + b2*mag^2 + b1*mag + b0. This model produces the following graph.
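A sketch of what such a quartic fit looks like in practice, using numpy's polyfit on the log10 frequencies (the data points below are invented to show the "kink"; they are not the real catalogue, and this is only my guess at the fitting procedure):

```python
import numpy as np

# Invented (magnitude, annual frequency) pairs with a kink after 6.5
mags = np.array([4.5, 5.5, 6.5, 7.5, 8.5])
freqs = np.array([100.0, 10.0, 1.0, 0.01, 1e-5])

# Degree-4 fit in log space:
# log10(N) = b4*M^4 + b3*M^3 + b2*M^2 + b1*M + b0
coeffs = np.polyfit(mags, np.log10(freqs), deg=4)

# The quartic bends down sharply past the kink, so extrapolating to
# magnitude 9 yields a far smaller frequency than a linear fit would
mag9_frequency = 10 ** np.polyval(coeffs, 9.0)
print(mag9_frequency)
```

This is the danger in a nutshell: with only a handful of points past the kink, a higher-order polynomial will happily chase the noise, and its extrapolated tail can make a very possible event look essentially impossible.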