Geological Data Interpretation with Python (III) – Diversity Index (open source)
The Python code provided in this article computes the Shannon-Wiener diversity index, a measurement frequently used in ecological and paleoecological research. In my opinion, Evenness (a measurement derived from the diversity index) is more straightforward to interpret. However, one should be aware of the number of species and the total abundance of each analyzed sample. An Evenness of one with only one species present does not indicate a perfect or normal environment.
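As a minimal sketch of the quantities mentioned above, the snippet below computes H = -sum(p_i * ln(p_i)), H_max = ln(S), and Evenness = H / H_max for a single sample. The function name `shannon_evenness` and the example counts are hypothetical and are not part of the code in this article. Reporting Evenness as 0 when fewer than two species are present is an assumption, since H_max is 0 in that case and the ratio is undefined.

```python
import math


def shannon_evenness(counts):
    """Return (H, S, N, evenness) for the species counts of one sample."""
    n_total = sum(counts)                      # total abundance N
    present = [c for c in counts if c > 0]     # species actually present
    richness = len(present)                    # species richness S
    # H = -sum(p_i * ln(p_i)) over the species that are present
    h = -sum((c / n_total) * math.log(c / n_total) for c in present)
    # H_max = ln(S); with fewer than two species it is 0, so Evenness is
    # reported as 0 here (an assumption for the undefined case)
    evenness = h / math.log(richness) if richness > 1 else 0.0
    return h, richness, n_total, evenness


# Hypothetical counts: both samples are perfectly even, yet they differ
# greatly in species richness and total abundance.
print(shannon_evenness([5, 5]))
print(shannon_evenness([40, 40, 40, 40, 40]))
```

Both samples have an Evenness of approximately one, which is why S and N should be read alongside it.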
The code herein is definitely not the cleanest or fastest code that can be written. However, running it is faster than implementing the calculation from scratch in Excel, so please feel free to use it. Make sure that you change the line:
f = open(r'C:\\Users\Emil\PycharmProjects\pythonTurtle\Data\HydroCrbVentsDataSet.txt', "r") so it points to the file you want to analyze.
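The input file needs to be in this format (as read by the code below): the first row holds a label for the depth column followed by one species name per column, and each following row holds a sample's depth followed by the abundance of each species.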
Create this in Excel and save it as a tab-delimited text file.
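The code in this article reads a tab-delimited file with the depth in the first column and one column per species. As an illustration of that layout, the sketch below writes a small file and reads it back. The file name `example_dataset.txt`, the column names, and all values are hypothetical; only the layout is taken from how the article's code reads its input.

```python
import csv
import os
import tempfile

# Hypothetical example data: a header row, then one row per sample.
rows = [
    ["Depth", "Species_A", "Species_B", "Species_C"],
    ["10", "12", "0", "3"],
    ["20", "5", "5", "5"],
]

# Write the rows as a tab-delimited text file in the temp directory.
path = os.path.join(tempfile.gettempdir(), "example_dataset.txt")
with open(path, "w", newline="") as handle:
    csv.writer(handle, delimiter="\t").writerows(rows)

# Read the file back: first row is the header, the rest are samples.
with open(path, newline="") as handle:
    header, *samples = list(csv.reader(handle, delimiter="\t"))

species_names = header[1:]          # everything after the depth label
print(species_names)
for depth_value, *counts in samples:
    print(depth_value, [float(c) for c in counts])
```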
It is my hope that you will find this useful. Use the code the way you want, but be aware that even open-source programming languages like Python have a legal side to them. Please ask questions if you have them. Constructive criticism is appreciated.
# This code computes the Shannon-Wiener Diversity Index, Maximum Diversity Possible, Evenness, Species Richness and Total Abundance
# The code plots the Diversity Index (H), Evenness, Species Richness (S), and Total Abundance
import numpy as np
import matplotlib.pyplot as plt
import math as mat
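depth = []       # depth label of each sample
species = []     # species names from the header row
abund = []       # abundances of the sample currently being read
totalAbund = []  # total abundance (N) per sample
index = []       # Shannon-Wiener diversity index (H) per sample
eve = []         # Evenness per sample
linesplit = ""
sR = []          # species richness (S) per sample
s = 0            # number of species present in the current sample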
f = open(r'C:\\Users\Emil\PycharmProjects\pythonTurtle\Data\HydroCrbVentsDataSet.txt', "r")
# Read the header row: the first column is the depth label, the rest are species names
lines = f.readline()
linesplit = lines.split("\t")
no_of_species = len(linesplit)  # number of columns, including the depth column
line_cnt = 0
data_line = 0
for line_cnt in range(1, no_of_species):
    species.append(linesplit[line_cnt].strip())
line_no = -1
num_data = []
vector = np.zeros(no_of_species)
pILnpiVector = np.zeros(no_of_species)
for OneLine in f.readlines():
    if not OneLine.strip():
        continue  # skip blank lines, e.g. at the end of the file
    line_no += 1
    LN = OneLine.split("\t")
    # Depth is kept as text, so matplotlib plots the samples as evenly spaced
    # categories in file order; use float(LN[0]) to plot numeric depths to scale
    depth.append(LN[0].strip())
    # Abundances of this sample, one value per species column
    abund = [float(value.strip()) for value in LN[1:]]
    num_data.append(abund)
    vector = np.zeros(no_of_species)
    vector[:len(abund)] = abund
    # Species richness: number of species with a non-zero abundance
    s = int(np.count_nonzero(vector))
    sR.append(s)
    s = 0
    total = np.sum(vector)
    totalAbund.append(total)
    # Shannon-Wiener terms pi * ln(pi); absent species contribute 0
    pILnpiVector = np.zeros(no_of_species)
    for vcnt, a in enumerate(abund):
        pi = a / total if total != 0 else 0
        if pi != 0:
            pILnpiVector[vcnt] = pi * mat.log(pi)
    data_line = line_no
    index.append(abs(np.sum(pILnpiVector)))
for x in range(len(depth)):
    if sR[x] > 1:
        # Evenness = H / Hmax, where Hmax = ln(S)
        Hmax = mat.log(sR[x])
        eve.append(index[x] / Hmax)
    else:
        # With zero or one species Hmax is 0, so Evenness is stored as 0
        # instead of dividing by zero
        Hmax = 0
        eve.append(Hmax)
f.close()
# Plot H, Evenness, S, and total abundance against depth, side by side
graph, (plot1, plot2, plot3, plot4) = plt.subplots(1, 4)
plot1.plot(index, depth, color='red', label='Species Diversity')
plot2.plot(eve, depth, color='blue', label='Evenness')
plot3.plot(sR, depth, color='green', label='Species Richness')
plot4.plot(totalAbund, depth, color='black', label='Abundance')
plot1.set_title('H')
plot2.set_title('Evenness')
plot3.set_title('S')
plot4.set_title('TotalAbund')
# Invert the y-axis of each panel (depth profiles are conventionally drawn
# with depth increasing downward)
plot1.invert_yaxis()
plot2.invert_yaxis()
plot3.invert_yaxis()
plot4.invert_yaxis()
graph.tight_layout()
plt.show()
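For comparison, the same four quantities can also be computed in vectorized form with pandas and NumPy. This is only a sketch of one possible alternative, not the method used in this article: the inline data, the column names, and the variable names are hypothetical, and storing Evenness as 0 for samples with fewer than two species is an assumption.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical tab-delimited data standing in for the real file.
text = "Depth\tSp_A\tSp_B\tSp_C\n10\t12\t0\t3\n20\t5\t5\t5\n30\t0\t0\t9\n"
counts = pd.read_csv(io.StringIO(text), sep="\t", index_col=0)

total_abund = counts.sum(axis=1)         # total abundance N per sample
richness = (counts > 0).sum(axis=1)      # species richness S per sample
p = counts.div(total_abund, axis=0)      # proportions p_i per sample

# p_i * ln(p_i), with 0 for absent species (the limit as p_i -> 0)
terms = (p * np.log(p.where(p > 0))).fillna(0.0)
H = -terms.sum(axis=1)

# Evenness = H / ln(S); stored as 0 where fewer than two species are present
evenness = (H / np.log(richness.where(richness > 1))).fillna(0.0)

print(pd.DataFrame({"H": H, "Evenness": evenness,
                    "S": richness, "N": total_abund}))
```

To use a real file, the `io.StringIO(text)` argument could be replaced with the path to the tab-delimited file.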
That's it. Have fun!