Apolar Compounds
Linear Alkanes
Linear alkanes ranging from methane to tetradecane are primary components of gas and petroleum and are believed to be mainly of biological origin. However, tracing their sources is complicated by thermodynamic cracking processes influenced by mineral catalysts and by non-biological geochemical sources. Notably, molecular fossils with carbon chains from pentadecane to pentatriacontane exhibit a predominance of odd- over even-numbered alkanes, a trend also seen in modern n-alkanes, which points to specific biosynthetic pathways. In contrast, Fischer-Tropsch-type abiotic synthesis from aqueous oxalic acid produces homologous n-alkanes without an odd/even preference. n-Alkane molecular fossils can therefore act as biomarkers indicating both biological origin and depositional environment, with specific alkanes being indicative of lacustrine or marine phytoplankton and of certain tropical marine settings.
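The odd/even predominance described above can be checked numerically. The following is a minimal sketch, assuming the n-alkane abundances are available as a mapping from carbon number to a measured amount; the function name, the input layout, and the example values are hypothetical, and this simple summed ratio is not one of the formal preference indices used in the literature.

```python
def odd_even_ratio(abundance, c_min=15, c_max=35):
    """Ratio of summed odd- to summed even-carbon-number n-alkane abundances.

    abundance: dict mapping carbon number (int) to a measured amount.
    c_min, c_max: inclusive carbon-number window (pentadecane to
    pentatriacontane by default, as in the text).
    """
    odd = sum(v for c, v in abundance.items()
              if c_min <= c <= c_max and c % 2 == 1)
    even = sum(v for c, v in abundance.items()
               if c_min <= c <= c_max and c % 2 == 0)
    if even == 0:
        raise ValueError("no even-numbered n-alkanes in the selected range")
    return odd / even


# Illustrative (made-up) abundances: odd homologs dominate.
sample = {15: 1.0, 16: 1.0, 17: 3.0, 18: 1.0}
ratio = odd_even_ratio(sample)  # (1 + 3) / (1 + 1)
```

A ratio clearly above 1 would be consistent with a biological odd-carbon preference, while a ratio near 1 would be consistent with the abiotic pattern described in the text.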
Non-marine lacustrine algae predominantly contain specific n-alkanes, while terrestrial higher plants are rich in others that derive primarily from leaf waxes. Red and green algae exhibit a significant presence of certain n-alkanes, whereas brown algae are particularly rich in different ones.
Figure 3 illustrates linear alkanes as molecular fossils and biomarkers and features a scanning electron microscopic image of the coccosphere of the alga Pleurochrysis carterae, formerly known as Syracosphaera carterae. This image, provided by J. R. Young from UCL in London, shows the intricate structure of this coccolithophorid alga, which primarily produces significant biological markers.
The relative amounts of specific linear alkanes can be used to calculate indices that indicate the sources of these compounds. One such index is the terrigenic/aquatic ratio (TAR), which is defined as TAR = ([11] + [12] + [13]) / ([5] + [6] + [7]), where the bracketed numbers denote the concentrations of the correspondingly numbered n-alkanes. This ratio helps assess the contribution of terrestrial (TAR > 1) versus aquatic (TAR < 1) flora to the n-alkane mixture found in sediment samples. It is particularly valuable for tracking environmental changes across different sediment layers. However, this index is sensitive to thermal processes and biodegradation and must be interpreted with care.
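The TAR definition above translates directly into a short calculation. This is a minimal sketch, assuming concentrations are supplied in a dictionary keyed by the compound numbers used in the text (5, 6, 7 and 11, 12, 13); the function names and the example values are hypothetical.

```python
# Compound numbers as used in the TAR definition in the text.
TERRIGENOUS = (11, 12, 13)
AQUATIC = (5, 6, 7)


def tar(conc):
    """Terrigenic/aquatic ratio from a dict {compound number: concentration}."""
    numerator = sum(conc[k] for k in TERRIGENOUS)
    denominator = sum(conc[k] for k in AQUATIC)
    if denominator == 0:
        raise ValueError("aquatic n-alkane concentrations sum to zero")
    return numerator / denominator


def dominant_source(tar_value):
    """Apply the TAR > 1 / TAR < 1 reading given in the text."""
    if tar_value > 1:
        return "terrestrial"
    if tar_value < 1:
        return "aquatic"
    return "balanced"


# Illustrative (made-up) concentrations in arbitrary but consistent units.
layer = {5: 2.0, 6: 3.0, 7: 1.0, 11: 4.0, 12: 5.0, 13: 3.0}
value = tar(layer)               # 12.0 / 6.0
label = dominant_source(value)   # "terrestrial"
```

Because the index is sensitive to thermal processes and biodegradation, a label produced this way is only a starting point for interpretation.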
Recent research on linear alkane biomarkers in modern plants indicates that the distribution of leaf waxes can be indicative of specific plant types, although significant caution is necessary due to the high annual variability in chain-length distribution. Notably, Sphagnum (peat mosses) predominantly features the n-alkanes 9 and 10. Additionally, the n-alkanes 9, 10, 11, and 12 have been identified in both the Early Triassic horsetail Equisetum brongniartii from Northern Vosges, France, and the contemporary Equisetum sylvaticum.
A useful distinction of sources can be derived from carbon isotope analysis conducted on terrestrial plant-derived n-alkanes. It turns out that $\delta^{13}\mathrm{C} = \left[\dfrac{([^{13}\mathrm{C}]/[^{12}\mathrm{C}])_{\mathrm{sample}}}{([^{13}\mathrm{C}]/[^{12}\mathrm{C}])_{\mathrm{standard}}} - 1\right] \times 1000$‰ amounts to –25‰ for C4 plants like grasses [15].
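The δ13C definition given above can be evaluated as follows. This is a minimal sketch; the function name is hypothetical, and the standard ratio used in the example is a placeholder chosen for illustration, not the certified value of any reference material.

```python
def delta13c_permil(r_sample, r_standard):
    """delta-13C in per mil from the 13C/12C ratios of sample and standard."""
    if r_standard <= 0:
        raise ValueError("standard ratio must be positive")
    return (r_sample / r_standard - 1.0) * 1000.0


# Placeholder standard ratio (assumption, for illustration only).
R_STANDARD = 0.0112
# A sample depleted in 13C by 2.5 % relative to the standard.
r_sample = R_STANDARD * 0.975
delta = delta13c_permil(r_sample, R_STANDARD)  # about -25 per mil
```

Negative values mean that the sample is depleted in 13C relative to the standard.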
The n-alkane fossils may stem from unaltered leaf waxes or from n-alkanes produced in small amounts by various organisms. Alternatively, they can result from diagenetic processes, which include the defunctionalization of fatty alcohols or the decarboxylation of fatty acids.
Branched Alkanes
Branched alkanes, specifically the iso- and anteiso-methyl types, are present in crude oils but do not serve as definitive biomarkers. This is due to their ambiguous origins, as they can arise from both biological processes and abiogenic reactions, such as isomerizations and thermal cracking.
Branched alkanes can originate from the decarboxylation of iso- and anteiso-fatty acids ranging from C15 to C25, which are typical of demosponges. Additionally, cyanobacteria produce monomethyl alkanes with the methyl group positioned further inward, resulting in the formation of 4- to 8-methyl alkanes in addition to iso- and anteiso-alkanes.
Fig 4 iso- and anteiso-Alkanes (n = 9–19)
These monomethyl alkanes are exemplified by the methylhexadecanes shown in Fig. 5 and extend up to the series of methylhenicosanes [19]. They are found in rather recent sediments associated with microbial mats [20], but can also be traced down to the Infracambrian [21].
The Permian to Carboniferous torbanites, particularly from Glen Davis in the Sydney basin, are notable for their high organic content, comprising up to 90% fossil remains of the colonial alga Botryococcus braunii. The methyl alkanes extracted from these sediments feature chain lengths ranging from 23 to 31 carbon atoms, with methyl groups located at various positions. It is hypothesized that these homologous series of methyl-n-alkanes originate from the botryals produced by B. braunii through microbial and diagenetic transformations, which makes them useful biomarkers for this alga.
Fig 5 The 4- to 8-methylhexadecane isomers produced by cyanobacteria
Fig 6 Formation of monomethylalkanes 17 and 18 from botryals 16
The monosubstituted alkane series, resembling the letter T and featuring chains with residues larger than methyl, has been identified in crude oils. A total of 163 isomers were recognized, including C10–C20 compounds with a single n-alkyl branch and C21–C25 isomers containing one ethyl branch. However, these compounds are likely products of thermodynamic equilibration rather than biological processes, making them unsuitable as biomarkers. Additionally, rare branched alkanes have been found in Early Cretaceous black shales, deep-sea hydrothermal waters, and various shales spanning over 800 million years, indicating their widespread presence in the geological record. These branched series include 2,2-dimethylalkanes with even carbon numbers from 16 to 26 and 3,3- and 5,5-diethylalkanes with odd carbon numbers from 15 to 29, a carbon-number pattern that is indicative of a biosynthetic origin. The precise source compounds and organisms responsible for these geminally substituted compounds, characterized by a quaternary carbon atom, remain unknown, yet their paleogeographic distribution suggests they may originate from non-photosynthesizing, sulfide-oxidizing organisms.
Multiply methyl-branched n-alkanes from oil or sediments are identified as molecular fossils of isoprenoids due to their distinctive skeletal patterns. The first group of isoprenoid compounds comprises linear isoprenoids linked head-to-tail. Hydrolysis of the phytol chain from chlorophylls a or b, bacteriochlorophylls c or d, or phytanyl ethers from methanogens produces phytol, which can undergo diagenesis. In oxic environments, pristane is formed through the oxidation of phytenic acid, while suboxic saline environments yield phytane via dihydrophytol. The ratio of pristane to phytane serves as an indicator of oxic versus suboxic depositional environments, though these compounds are of limited use as specific biomarkers for photosynthesizing organisms.
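The pristane/phytane ratio mentioned above is a simple quotient. The following is a minimal sketch, assuming that both amounts are measured in the same units; the function names are hypothetical, and the cutoff is deliberately a caller-supplied parameter because the text does not state a threshold value.

```python
def pristane_phytane_ratio(pristane, phytane):
    """Quotient of pristane to phytane amounts (same units for both)."""
    if phytane <= 0:
        raise ValueError("phytane amount must be positive")
    return pristane / phytane


def redox_tendency(ratio, cutoff):
    """Higher ratios point toward oxic deposition, since pristane forms
    under oxic and phytane under suboxic conditions.

    cutoff is an assumption supplied by the caller; it is not taken
    from the text.
    """
    return "more oxic" if ratio > cutoff else "more suboxic"


# Illustrative (made-up) peak areas and an arbitrary example cutoff.
ratio = pristane_phytane_ratio(3.0, 1.5)       # 2.0
tendency = redox_tendency(ratio, cutoff=1.0)   # "more oxic"
```

Since pristane also has sources other than chlorophyll-derived phytol, a ratio computed this way should be read together with other evidence.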
Pristane can be produced through the reductive cleavage of α-tocopherol, a common antioxidant found in plants and algae, initiated by thermal electrocyclic ring opening. However, due to the dominance of chlorophylls in photosynthesizing organisms, α-tocopherol is likely a minor source of pristane. Additionally, marine sediments may contain pristane resulting from the metabolic transformation of phytol, which is biosynthesized by the zooplankton Calanus hyperboreus, as this organism utilizes phytol to adjust its buoyancy in the water column.
Regular head-to-tail linear isoprenoids with carbon chains ranging from 13 to 20 are commonly found in crude oils and source rock extracts, though their specific origins remain unidentified. These compounds may derive from farnesyl residues.
Fig 7 Formation of pristane (17) and phytane (3) in (a) oxic and (b) suboxic environments.
(length ~8 mm), which produces 16 for better buoyancy; photograph:
Pristane (17) can be formed from α-tocopherol (22) or specific isoprenoid aldehydes and ketones. The ratio of these components is often used to evaluate the correlation of oil sources. Additionally, linear isoprenoids with carbon chain lengths exceeding 20 are effective indicators of saline environments, exemplified by the presence of a C25 isoprenoid hydrocarbon.
Recent studies suggest that certain linear isoprenoid compounds, particularly those linked head-to-head and tail-to-tail, may originate from halophilic Archaea. Notably, head-to-head linear isoprenoids with carbon chains ranging from 28 to 39 have been identified in Jurassic oils from Siberia. These compounds, formed by the linking of pristane or phytane to larger carbon units, serve as specific biomarkers for Archaea. An example is biphytane, which derives from glycerol diether or tetraether lipids, integral to the structural makeup of archaeal membranes.
Crocetane, an irregular tail-to-tail isoprenoid formed through the diagenetic reduction of crocetene, serves as a biomarker for anaerobic methanotrophic and methanogenic Archaea. Additionally, 2,6,10,19-tetramethylicosane is another significant indicator in this context.
The archaean biomarker biphytane, an irregular head-to-head isoprenoid, and its biogenic precursors, found in the Maastrichtian-Danian shale of the Californian Moreno Formation, raise questions about their authenticity and origin. Additionally, the structurally related 2,6,15,19-tetramethylicosane, discovered in the Albian black Niveau Paquier shale in southeastern France, also lacks clarity regarding its biological origin, similar to the 10-ethyl-2,6,15,19-tetramethylicosane identified in the same sediments.
Squalane (31) was identified from crude oil [36], but its ubiquitously occurring precursor squalene (32) makes it hardly a specific biomarker. However, high concentrations of 31 in sediments that are also rich in 2,6,10,19-tetramethylicosane indicate an archaean origin. Two specific tetramethylsqualanes, 3,7,18,22-tetramethylsqualane and 3,7,11,14-tetramethylsqualene, were extracted from Sumatran crude oil sourced from the Miocene to Pleistocene Duri and Minas wells in the central basin. These compounds are closely associated with Botryococcus braunii and suggest a lacustrine origin. Furthermore, similar anoxic conditions are observed in lacustrine sediments, such as those found in the renowned German Eocene Messel.
Irregular tail-to-tail isoprenoids, including crocetane, crocetene, and alkyl-substituted icosanes, along with Australian Miocene Condor oil shales, contribute to the significant presence of lycopane. The primary source of lycopane is the carotenoid lycopene, which is prevalent in various microorganisms and plants, indicating its low organismic specificity.
Highly branched isoprenoids serve as potential biomarkers for diatoms found in Quaternary to Jurassic source rocks and their corresponding oils. The primary types are distinguished by their carbon contents, specifically C20, C25, and C30. The biological origin of these isoprenoids is supported by the discovery of various unsaturated derivatives in cultures of the diatoms Haslea ostrearia and Rhizosolenia setigera. Notably, trienes, tetraenes, and pentaenes are preferentially produced.
Irregular tail-to-tail isoprenoids, such as pentamethyl-7-(3-methylpentyl)nonadeca-2,5,9,12,16-pentaene, are significant in geochemical studies. These unsaturated derivatives can easily undergo diagenetic reduction to form saturated compounds. Originating from diatoms that evolved in the Jurassic period, these highly branched isoprenoids serve as valuable tools for addressing age-related questions and for chemostratigraphic investigations.
Fig 12 Highly branched isoprenoids 37–39 and the pentaene 40. Bottom: Haslea ostrearia; photograph courtesy of K. Wenderoth (Ebsdorfergrund, Germany)
Alicyclic Compounds
The molecular fossils to be discussed in this section are either simple alkyl-substituted cycloalkanes or belong to the terpenoids, steroids, and carotenoids, groups containing vast numbers of compounds.
Alkyl-substituted cycloalkanes are found in sediments and crude oils from various geological periods, representing a legacy of ancient life forms. While these compounds cannot always be linked to specific biological sources, they are considered molecular fossils. Their formation is likely a result of thermal processes that occur during the catagenesis of petroleum.
A study of the Cretaceous/Paleogene boundary sediments at Kawaruppu, Hokkaido, Japan, identified over sixty alkyl cycloalkanes. These include various homologous series such as alkylcyclohexanes, dialkylcyclohexanes, and alkylcyclopentanes, as well as decalin derivatives and bicyclo[3.3.1]nonane. Notably, the distribution of these compounds differs between sediment layers from the Cretaceous and Paleogene periods, suggesting a connection to the mass extinction event that marked the end of the Cretaceous and the extinction of the dinosaurs.
A series of non-isoprenoid macrocyclic alkanes, ranging from cyclopentadecane to cyclotetratriacontane, along with monomethylated homologs from cycloheptadecane to cyclohexacosane, were identified in a Carboniferous torbanite enriched with Botryococcus braunii fossils. The presence of these compounds clearly indicates their origin from this specific organism.
There are only a few examples of cyclic sesquiterpanes found as molecular fossils. Thus, two bicyclic sesquiterpanoids, drimane (109) and 4β-eudesmane (110) (Fig.
In the Cretaceous Cormorant Field located in the Gippsland Basin, east of Melbourne, Victoria, Australia, prokaryotic compounds and higher plant terpenes were identified in oil samples. Additionally, various sesquiterpene derivatives have been found in sediments, lignites, and petroleum, although their origins do not appear to be linked to specific organisms.
Cembrene (113) and its related compounds are identified as products of mild alkaline hydrolysis of kerogen found in lacustrine sediment from the Nördlinger Ries in southern Germany. These compounds likely originate from resinous plants that produce cembrene derivatives, which are typical of a semiarid climate.
Cadinane and two bicadinane isomers, identified as 115 and 116, are bicyclic diterpane molecular fossils discovered in Miocene rock extracts and oils from Southeast Asia. Cadinane is believed to derive from gymnosperm dammar resins, establishing it as a significant biomarker for these plants. Additionally, cadinane and the bicadinanes were also found in Late Triassic to Middle Jurassic oils from the Perth Basin in Western Australia.
Fig 15 Alkylcyclopentanes 86–103, decalins 104 and 105, bicyclo[3.3.1]nonane (106), and the cycloalkanes 107 and methylcycloalkanes 108
Fichtelite (117), the most notable tricyclic diterpane, was first identified in 1841 as a rare organic mineral associated with pine wood residues in Bavarian peat bogs. Its name reflects its geographical origin. The structure of fichtelite was elucidated by Nobel laureate Leopold Ruzicka during his pivotal research on isoprenoids. This biomarker is recognized as a diagenetic product of abietic acid (118), the primary component of pine resin. Noteworthy is the elegant synthesis of fichtelite by Taber and Saleh. Additionally, related organic minerals, such as simonellite (119), are often found alongside fichtelite in conifer resin fossils.
Fig 16 Drimane (109), 4β-eudesmane (110), and sesquiterpane derivatives 111 and 112
Simonellite (Nickel-Strunz class 10.BA.45), named after its discoverer Vittorio Simonelli, and retene, also known as phylloretine, are significant organic compounds that indicate wood fires. Additionally, the rare organic mineral dinite, classified as Nickel-Strunz class 10.BA.15, is often found alongside bituminous fossil wood and coal.
Fichtelite (117) and abietic acid (118): the fichtelite specimen is held in the Mineral Collection of the Museum of Natural History in Vienna (photograph by Vera Hammer). The left plate features Picea abies, the tree source of pitch, which primarily contains abietic acid (118) (photograph by H. Falk).
115 (trans,trans,trans,trans,trans-bicadinane)
116 (trans,trans,cis,cis,trans-bicadinane)
Cadinane and bicadinanes have been discovered in Late Miocene river sediments in Castelnuovo di Garfagnana, Tuscany, Italy. Additionally, fichtelite, along with various bicyclic diterpanes, was identified in crude oil and source rock from the Late Cretaceous Gippsland Basin in Australia, where one of the compounds was referred to as isopimarane.
Recent studies have identified several tetracyclic diterpanoids alongside the previously mentioned bicyclic diterpanoids, which are linked to conifer leaf resins from the Podocarpaceae, Araucariaceae, and Cupressaceae families. Notably, phyllocladanes are characteristic of resins from Podocarpus and Dacrydium, while kauranes have been discovered in Agathis. These markers were also present in Early Carboniferous Gondwana coal found in Niger. The widespread distribution of tetracyclic diterpenes is supported by research conducted by Simoneit et al., who examined Devonian coal samples from Luquan, China, and noted their structural similarity to gibberellins, common plant hormones. Given that Gymnospermae did not exist during the Devonian, it is hypothesized that these tetracyclic terpanes may originate from microbial, fungal, or lower plant sources. Beyerane is a predominant terpane in Saarland coals, while Ruhr coal contains higher levels of kauranes. Both coal types date back to the Carboniferous period, prior to the evolution of the aforementioned conifer families, suggesting that these diterpanes may stem from early conifers of the Voltziales order.
The tetracyclic organic mineral hartite (136), classified under Nickel-Strunz class 10.BA.10, was described in 1841 by Wilhelm von Haidinger, who named it after the Oberhart coal mine in Lower Austria. This discovery, along with the synthesis of fichtelite (117), highlights the significant advancements in mineralogy during this period.
Tricyclic diterpane biomarkers, specifically simonellite from Monte Pulciano, Toscana, Italy, are shown in Fig 21. This specimen is part of the Mineral Collection at the Museum of Natural History in Vienna (inv. no. J6938; photograph by Vera Hammer). Additionally, the organic mineral component, which is also derived from wood, is associated with macrofossils.
Cheilanthanes, a group of tricyclic terpanes named after the fern Cheilanthes farinosa, comprise compounds that can reach up to C45 and were first identified in Paleogene Californian oil. The primary member of this group is compound 137, while others, such as 138 and 139, have been isolated from oil sand.
Tetracyclic diterpanes, specifically compounds 128–136, include hartite (136), which appears as white specks on coal from Oberhart near Gloggnitz, Austria. Recent studies have characterized compounds 140 and 141 in the sediments of Lake Cadagno, Switzerland. Potential biological sources for these compounds include plant-derived di-(E)-poly-(Z)-polyprenols, along with a diverse range of over fifty cheilanthanes found in various marine organisms that may also contribute to these molecular fossils.
Other Aromatic and Heterocyclic Compounds
From Simple Aromatic Compounds to Polyaromatic Hydrocarbons
The first group of molecular fossils to be examined are phenyl alkanes, which feature an aromatic unit not previously described. Due to the widespread occurrence of linear alkyl sulfonate detergents in the geosphere, caution is necessary when evaluating these compounds as molecular fossils. However, research has conclusively shown that phenyl alkane 318, with substitutions at positions C-2 to C-6 and carbon numbers ranging from 13, is particularly relevant in the context of Australian crude oils and sediments.
These phenyl alkanes are authentic descendants of ancient life, as contemporary phenyl alkane detergents typically contain alkyl chains of only ten to fourteen carbon atoms. Research indicates that phenyl alkanes likely originate from algal species such as Botryococcus braunii or from microorganisms such as Thermoplasma acidophilum.
Recent studies have identified n-alkanes with chain lengths up to 25 carbon atoms, substituted at position C-1 with 1- or 2-naphthyl groups, in Cretaceous sedimentary source rocks of the onshore Songliao Basin in northeastern China. These compounds may originate from microbial and terrestrial terpenoid sources, but their presence could also result from ring isomerizations and transalkylation processes as the maturity of the source rocks increases.
1,2,7-Trimethylnaphthalene and 1,2,5-trimethylnaphthalene are diagenetic products derived from oleanene-type triterpenoids, such as β-amyrin, and thus point to a contribution of angiosperms, including dandelion (Taraxacum officinale). Additionally, in certain oils 1,3,6,7-tetramethylnaphthalene (324) (Fig. 50) is encountered but seems to be derived from bacterial sources [152].
Five very interesting isohexyl-alkylnaphthalenes (325–329) were identified in a suite of crude oils from the Cambrian to the Paleogene from around the world (Fig.
The presence of the benzene derivative from the Permian period suggests origins from both bacterial or algal sources and higher plants. In contrast, the alkylnaphthalenes are exclusively linked to di- and triterpenoid precursors from higher plants.
The structural requirements for producing certain products through ring opening include a terpenoid A–B ring system with a geminal dimethyl group at position C-4 and an angular methyl group at position C-10, as demonstrated by model reactions using natural products like phyllocladene and olean-18-ene. Additionally, in Devonian to Cretaceous crude oils and partially saturated coals, condensed aromatic compounds such as methyltetralin and methylindan isomers have been identified, indicating a biological origin, though their specific source remains unclear.
Fig 50 Phenyl- and naphth-1- and 2-yl-n-alkanes 318, 319, and 320. Trimethylnaphthalenes 322 and 323 derive from β-amyrin (321) (with one of its sources, Taraxacum officinale; photograph: H. Falk); 1,3,6,7-tetramethylnaphthalene (324)
Fig 51 Isohexyl aromatic molecular fossils: 2-methyl-1-(4-methylpentyl)naphthalene (325), 2,6-dimethyl-1-(4-methylpentyl)naphthalene (326), 6-ethyl-2-methyl-1-(4-methylpentyl)naphthalene (327), 6-i-propyl-2-methyl-1-(4-methylpentyl)naphthalene (328), 2-ethyl-4,6-dimethyl-1-(4-methylpentyl)naphthalene (329), and 1,2,4-trimethyl-3-(4-methylpentyl)benzene (330)
Fig 52 Mechanistic details of the ring opening of terpenes to yield isohexyl substituted benzenes and naphthalenes with natural products 331 and 332 showing the respective prerequisites
The isomers studied were determined to be racemic, similar to the 1-methyl- and 1,3-dimethylindans. Notably, the concentration of 2-methyltetralin was approximately double that of 1-methyltetralin. Additionally, in the case of the indans, the cis-336 isomer was present in a slight excess, ranging from 52% to 70%, compared to the trans isomer.
1,1,5,6-Tetramethylnaphthalene (337) and cadalene (338), originating from cadinane (114), a marker for dammar resin-producing gymnosperms, are general biomarkers for higher plants [155].
Phenanthrene molecular fossils like fichtelite (117) (see Fig 19), simonellite (119), and retene (120) (see Fig 21), derived from abietic acid (118), were already discussed above.
Tetralin and indan derivatives, along with cadalene and perylene, are discussed in the context of their potential origins from both conifers and ancient algal and bacterial biomass. Notably, fluorene and phenanthrene, classified as the organic minerals kratochvilite and ravatite, are excluded from specific natural product attribution because, as molecular fossils, they may also have abiogenic origins. This leads to a focus on higher polyaromatic systems, particularly perylene.
Perylene (339) is a prevalent condensed compound found in sediments, as evidenced by a study in the anoxic basin of Saanich Inlet, British Columbia, which identified its presence in both marine and terrestrial strata. This compound forms through microbially mediated diagenesis, although the specific organisms responsible for its precursors remain unidentified, as it can originate from both aquatic and terrestrial organic materials or through other microbial processes. Additionally, research on Late Triassic to Middle Jurassic sediments in the Northern Carnarvon Basin, Australia, indicated a diagenetic origin of perylene at approximately 0.6 ppm.
Perylene (339) may be traced to mostly terrestrial sources like fungi [158]. These organisms may biosynthesize binaphthyls or perylenequinones like hypocrellin (340), cercosporin (341), shiraiachrome (342), stemphyperylenol (343), or erythroaphin
A recent in-depth study of a Holocene sediment profile from the Qingpu trench in the Yangtze River Delta, China, has shed light on the controversial relationship between perylene concentration and wood-degrading fungi. This research analyzed sediment samples dating back to the Devonian period, revealing a clear connection between the concentration of perylene and the presence of these fungi.
Sediments, coal deposits, and crude oils are significant sources of polyaromatic hydrocarbons (PAHs), which can vary in composition (345–355). These compounds may originate from extensive chemical transformations of natural organic materials, combustion, or thermal processes. Additionally, PAHs can result from non-biogenic processes, such as the "zigzag process," which involves the addition of acetylene and butadiene radicals to produce the most thermodynamically stable PAHs.
Polyaromatic hydrocarbons can serve as molecular fossils, although some may originate from non-biogenic sources. Unlike most polyaromatic hydrocarbons, 1,2,3,4,5,6-hexahydrophenanthro[1,10,9,8-opqra]perylene (356) is a specific diagenetic molecular fossil and a biomarker for hypericinoids, which will be further explored in Section 2.2.3. A synthesis and X-ray structural analysis of compound 356 have been conducted to confirm its identity, revealing a spatial structure characterized by distorted saturated rings due to steric strain.
345 (anthracene) 346 (benzo[a]anthracene) 347 (benzo[a]pyrene)
354 (benzo[ghi]perylene) 355 (coronene) 356 (1,2,3,4,5,6-hexahydrophenanthro[1,10,9,8-opqra]perylene)
Fig 54 Polyaromatic hydrocarbons 345–356 and bottom: X-ray analytical structure of 356 [163]
Since the groundbreaking research by Alfred Treibs, the study of porphyrins in geological materials, often referred to as "geoporphyrins," has undergone substantial evolution. This area of study has advanced notably over the past few decades.
The "zigzag" formation of polyaromatic hydrocarbons has been extensively studied since the advent of modern analytical methods like HPLC and NMR, with ongoing research flourishing into the twenty-first century. Geoporphyrins are widely distributed in sediments, coals, and oils from various geological periods, with notable sources including the Devonian Henryville beds, the Permian Kupferschiefer in Germany, the Triassic Serpiano oil shale in Switzerland, and the Jurassic Toarcian shales of Europe. Other significant deposits are found in the Cretaceous Julia Creek in Australia, the Eocene Messel and Green River formations in Germany and the USA, respectively, the Miocene Monterey Formation in California, and the Pliocene Willershausen deposits in Germany. Furthermore, porphyrin derivatives have been proposed as promising biomarkers in the search for extraterrestrial life.
There are two main series of nearly one hundred known geoporphyrins: one series is diagenetically derived from heme, cytochrome, and other naturally occurring cyclic tetrapyrroles, while the other series originates from the diverse chlorophylls found in photosynthesizing organisms.