The Geopolitics of Ancestry: Institutional Hegemony, Private Equity Integration, and the AI Frontier in Global Genealogical Data
The contemporary landscape of genealogical research has evolved from a decentralized collection of local parish records and family bibles into a sophisticated, multi-billion-dollar global industry characterized by centralized data repositories, high-frequency private equity transactions, and advanced computational linguistics. At the center of this transformation are two disparate but deeply intertwined entities: The Church of Jesus Christ of Latter-day Saints (LDS), a religious institution with an existential and theological mandate for record preservation, and Blackstone Inc., a global asset management titan that has identified genealogical data as a premier high-yield technology asset. This report provides an exhaustive analysis of the infrastructure, market dynamics, and technological advancements—specifically the integration of Artificial Intelligence (AI)—that define the current state of the genealogical industrial complex.
The Institutional Foundation: The LDS Church as a Global Data Sovereign
The Church of Jesus Christ of Latter-day Saints occupies a unique position in the global data economy. Unlike commercial entities, the LDS Church’s involvement in genealogy is rooted in the doctrine of "salvific work for the dead," which posits that living members can perform essential religious ordinances, such as baptism, on behalf of deceased ancestors.1 This theological necessity has driven the development of the world’s most robust genealogical infrastructure, centered on the FamilySearch International non-profit organization.
Quantitative Analysis of Volume and Reach
By the end of 2025, FamilySearch had solidified its status as the world’s largest genealogical repository, managing a volume of data that surpasses most national archives. The organization reported over 22.7 billion searchable names and images in its historical record collections, with a growth rate of approximately 2.2 billion new records per year.3 This expansion is not merely digital but is supported by a massive global network of physical and administrative infrastructure.
FamilySearch Global Infrastructure and Volume Metrics (2025) | Statistics and Details |
Total Searchable Names and Images | 22.7 Billion 3 |
Annual Increase in Searchable Records | 2.2 Billion 3 |
FamilySearch Family Tree Population | 1.8 Billion Individuals 3 |
Annual Tree Growth | 163 Million People 3 |
Global FamilySearch Centers | 6,447 Locations 3 |
Digital Library Publications | 655,000+ Free Historical Books 3 |
Participating Countries and Territories | 235 3 |
Monthly Digitization Rate | 35 Million New Images 4 |
The scale of this operation is further illuminated by the diversity of the records acquired. In 2025 alone, FamilySearch targeted high-volume record sets from Italy, the Philippines, Brazil, and France, reflecting a strategic move to fill "data gaps" in non-English speaking regions.3 The acquisition of these records involves complex negotiations with national governments, local dioceses, and municipal archives, often resulting in the digitizing of records that have remained unindexed for centuries.
Sample of Specific Record Expansions (2025 Monthly Updates) | Record Count | Collection ID |
Italy, Bari Civil Registration (State Archive) | 4,180,895 | 1968511 5 |
Ireland, Dog License Registrations | 2,743,943 | 5000212 6 |
Brazil, Cemetery Records | 2,256,105 | 2137269 7 |
Guatemala, Civil Registration | 2,040,105 | 2075150 7 |
United States City and Business Directories | 44,844,493 | 3754697 5 |
Philippines, Church Census Records | 1,380,861 | 5000216 8 |
This granular data collection demonstrates a move toward "total archival capture." For instance, the inclusion of dog licenses in Ireland or cemetery records in Brazil suggests an attempt to corroborate an individual's existence through non-traditional records when standard civil registrations are missing or destroyed.6
The Physical Bastion: Granite Mountain Records Vault
The physical security of this data is managed through the Granite Mountain Records Vault (GMRV), a facility that represents the ultimate intersection of archival science and Cold War defensive architecture. Located in Little Cottonwood Canyon near Salt Lake City, the vault is excavated 600 feet into a mountain composed primarily of quartz monzonite—a dense, igneous rock with properties nearly identical to granite.9
The facility was designed to withstand a nuclear blast, featuring Mosler doors weighing up to 14 tons and internal climate controls that maintain the precise temperature and humidity required for long-term microfilm storage.1 This "fortress of history" currently houses over 2.4 million rolls of microfilm and 1 million microfiche, the equivalent of roughly 3 billion pages of genealogical data.10 While the Church is aggressively digitizing these holdings to make them available via FamilySearch.org, the physical masters remain in the vault as a redundant "hard copy" of human civilization.11
The Financial Pivot: Blackstone Inc. and the Commodification of Ancestry
While the LDS Church operates from a non-profit, theological mandate, the commercial side of the industry has undergone a radical financialization. In December 2020, Blackstone Inc., the world's largest alternative asset manager, completed the acquisition of Ancestry.com for $4.7 billion.13 This transaction signaled a shift in how genealogical data is perceived by capital markets—not as a niche hobbyist service, but as a high-value tech platform centered on proprietary data and recurring subscription revenue.
Strategic Logic of the Blackstone Acquisition
Blackstone’s entry into the market was driven by Ancestry’s transition into a technology-first enterprise. With over $1 billion in annual revenue and approximately 3.6 million paying subscribers by 2022, Ancestry represents a premier "data moat".14 The acquisition was structured as an all-stock transaction, a detail that later became pivotal in litigation over data disclosure: the Seventh Circuit held that an all-stock acquisition, without more, does not trigger liability under the Illinois Genetic Information Privacy Act.16
Evolution of Ancestry Ownership | Period | Lead Investors/Owners |
Early Expansion (Infobases) | 1990–1997 | Paul Allen, Dan Taggart (BYU Grads) 13 |
Corporate Transition | 1997–2012 | Various Public/Private Equity 14 |
European Expansion | 2012–2016 | Permira (London-based) 13 |
Tech Integration | 2016–2020 | Silver Lake, GIC (Singapore) 13 |
Institutional Asset Phase | 2020–Present | Blackstone Inc. 13 |
By 2025, Blackstone’s investment strategy appeared to have reached a phase of maturation, with reports emerging that the firm was considering an IPO or a sale of Ancestry at a valuation of approximately $10 billion.15 This potential doubling of value in five years highlights the success of the subscription model in an era where DNA testing growth has slowed, but historical record access remains a "sticky" consumer product.19
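The "doubling" can be restated as an implied annual growth rate. The sketch below is a back-of-the-envelope calculation only: it assumes the $4.7 billion purchase price and the reported $10 billion figure are directly comparable valuations and ignores debt, dividends, and fees. The function name is illustrative.

```python
def implied_annual_growth(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate implied by two valuations."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures from the text: $4.7B purchase price (2020), ~$10B reported target (2025).
multiple = 10.0 / 4.7
rate = implied_annual_growth(4.7, 10.0, 5)
print(f"multiple: {multiple:.2f}x, implied annual growth: {rate:.1%}")
```

Under these assumptions the multiple is roughly 2.1x, which corresponds to an implied growth rate of about 16% per year.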
The LDS-Ancestry Relationship: A Symbiotic Nexus
The relationship between the LDS Church and Ancestry is one of the most misunderstood aspects of the industry. Although the public frequently conflates the two, Ancestry has never been owned by the LDS Church.13 However, the two entities share a deep operational symbiosis. Ancestry was founded by graduates of Brigham Young University (an LDS-owned institution), is headquartered in Utah, and relies heavily on data-sharing agreements with FamilySearch.13
This symbiosis is formalized through strategic partnerships, most notably a series of agreements signed in 2013 and 2014.4 Under these terms, FamilySearch provides raw digital images from its vault to commercial partners. In exchange, the partners—such as Ancestry—invest millions of dollars (Ancestry committed $60 million over five years in one such deal) to index these records.4 This "indexing-for-access" model allows FamilySearch to make records searchable decades faster than its volunteer force could alone, while providing Ancestry with exclusive "restricted period" windows where the searchable index is only available to its paying subscribers.4
AI and the Resolution of Genealogical Discrepancies
The integration of Artificial Intelligence (AI) represents the third major pillar of modern genealogical research. AI is no longer a peripheral tool for search optimization but is now the primary engine for resolving historical discrepancies and validating connections that have baffled human researchers for generations.
Handwriting Recognition and Archival Extraction
A primary bottleneck in genealogy has historically been the "unsearchable image": a digital photo of a handwritten record that requires manual indexing. FamilySearch’s release of "Full-Text Search" technology, powered by sophisticated handwritten text recognition (HTR) models, has fundamentally altered this dynamic.3 These models use deep learning to interpret 17th- and 18th-century script, transforming static images into searchable text. This allows AI to bypass the human transcriber, who is often prone to "read-errors" due to fatigue or unfamiliarity with archaic scripts and dialects.
The technical mechanism for this extraction relies on transformer-based architectures that assign probabilities to character sequences. For example, when an AI encounters a faded record from the 1830s, it does not merely "read" the name; it calculates the likelihood of various character combinations based on contemporary naming conventions and local census data.
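The idea of combining what the recognizer "sees" with what names were common at the time can be sketched as a simple Bayesian rescoring step. This is an illustration, not FamilySearch's pipeline: the candidate transcriptions, their log-likelihoods, and the name counts are invented placeholder values, and `rescore` is a hypothetical helper.

```python
import math


def rescore(candidates: dict[str, float], name_counts: dict[str, int],
            smoothing: float = 1.0) -> dict[str, float]:
    """Combine recognizer log-likelihoods with a name-frequency prior.

    candidates:  transcription -> log P(image | transcription), placeholder values
    name_counts: occurrences of each name in a reference list, placeholder values
    Returns posterior probabilities normalized over the candidates.
    """
    total = sum(name_counts.values()) + smoothing * len(candidates)
    scores = {}
    for text, log_likelihood in candidates.items():
        # Add-one style smoothing keeps unseen names possible but unlikely.
        prior = (name_counts.get(text, 0) + smoothing) / total
        scores[text] = log_likelihood + math.log(prior)
    # Normalize in log space (log-sum-exp) for numerical stability.
    peak = max(scores.values())
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores.values()))
    return {text: math.exp(s - log_norm) for text, s in scores.items()}


# A faded entry that the recognizer reads most readily as "Hamah".
candidates = {"Hannah": -1.2, "Hamah": -1.0, "Harrah": -2.5}
name_counts = {"Hannah": 120, "Harrah": 3}

posterior = rescore(candidates, name_counts)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

Here the raw recognizer slightly prefers "Hamah", but the name prior shifts the result to "Hannah", which is the behavior the paragraph describes.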
Probabilistic Identity and the "Percent of Certainty" Factor
The concept of "probabilistic identity" is perhaps the most significant contribution of AI to the field. Because historical records are often fragmented, inconsistent, or intentionally falsified, AI can provide a "percent of certainty" for a given connection or biographical fact. This is not a guess, but a mathematical output derived from token probabilities and log-likelihoods.
In a technical sense, Large Language Models (LLMs) used in genealogy assign a "confidence score" to their outputs. This is calculated by aggregating the log probabilities of the tokens that constitute a biographical claim. If an AI claims that "John Smith of London" in a marriage record is the same as "John Smith of New York" in a later census, it calculates the joint probability of this match based on variables such as birth date consistency, spouse name alignment, and migration patterns.21
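As a minimal sketch of that aggregation, the joint probability of a generated claim is the exponential of the sum of its token log probabilities, and a length-normalized variant takes the geometric mean per token. The log-probability values below are placeholders, and the function name is hypothetical. Note that such a score measures how likely the model was to produce the text; it is not by itself a calibrated probability that the claim is true.

```python
import math


def sequence_confidence(token_logprobs: list[float]) -> dict[str, float]:
    """Aggregate per-token log probabilities into sequence-level scores."""
    if not token_logprobs:
        raise ValueError("at least one token log probability is required")
    joint_logprob = sum(token_logprobs)
    return {
        "joint_logprob": joint_logprob,
        # Probability of the whole token sequence.
        "joint_probability": math.exp(joint_logprob),
        # Geometric mean per token, which does not penalize longer claims.
        "per_token_probability": math.exp(joint_logprob / len(token_logprobs)),
    }


# Placeholder log probabilities for the tokens of one biographical claim.
scores = sequence_confidence([-0.05, -0.10, -0.02, -0.30])
print({name: round(value, 3) for name, value in scores.items()})
```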

This "percent of certainty" factor is invaluable for "cementing" an individual's existence. For example, if multiple records point to the same birth date but different birth locations, the AI can weight the "probative value" of each record (perhaps prioritizing a birth certificate over a self-reported census entry) to arrive at a final certainty score of 85% for a specific biographical profile.21 Consumer DNA services already expose similar probabilistic controls: 23andMe's Ancestry Composition report, for example, lets users adjust "confidence levels" from 50% (speculative) to 90% (conservative).23
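One simple way to picture this weighting is a reliability-weighted vote across conflicting records. The sketch below is illustrative only: the source types and their weights are invented assumptions, and the resulting share is a heuristic score rather than a calibrated probability.

```python
from collections import defaultdict

# Hypothetical reliability weights per source type (assumed, not from any vendor).
SOURCE_WEIGHTS = {
    "birth_certificate": 0.9,
    "baptism_register": 0.7,
    "census_self_report": 0.4,
}


def weigh_claims(records: list[tuple[str, str]]) -> dict[str, float]:
    """Return each claimed value's share of the total evidence weight.

    records: (source_type, claimed_value) pairs for one biographical fact.
    """
    totals: dict[str, float] = defaultdict(float)
    for source_type, value in records:
        totals[value] += SOURCE_WEIGHTS[source_type]
    grand_total = sum(totals.values())
    return {value: weight / grand_total for value, weight in totals.items()}


# Three records agree on the birth date but disagree on the birthplace.
records = [
    ("birth_certificate", "Cork"),
    ("baptism_register", "Cork"),
    ("census_self_report", "Dublin"),
]
shares = weigh_claims(records)
print({place: round(share, 2) for place, share in shares.items()})
```

With these assumed weights, the birthplace supported by the certificate and the register receives about 80% of the evidence weight.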
Public Revelations and the "Scrubbing" Controversy
Both the LDS Church and Ancestry have made public statements regarding the use of AI to "scrub and cleanse" their databases, though their motivations and transparency levels differ.
FamilySearch: The Transparency of Data Quality
FamilySearch has been relatively public about its AI initiatives through its "FamilySearch Labs" and annual reviews.3 One of their primary AI-driven tools is the "Data Quality Score," which analyzes ancestor profiles to identify conflicts—such as a child born before the mother's birth or a death date preceding a marriage date.3 The system uses AI to flag these discrepancies and encourage users to resolve them, thereby "cleansing" the global family tree.
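The kinds of conflicts named above can be expressed as plain date comparisons. The following is a minimal sketch of such checks, not FamilySearch's actual Data Quality Score logic; the `Person` structure and the messages are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Person:
    name: str
    birth: Optional[date] = None
    marriage: Optional[date] = None
    death: Optional[date] = None


def find_conflicts(person: Person, mother: Optional[Person] = None) -> list[str]:
    """Flag chronologically impossible facts on a profile."""
    conflicts = []
    if person.birth and person.death and person.death < person.birth:
        conflicts.append("death precedes birth")
    if person.marriage and person.death and person.death < person.marriage:
        conflicts.append("death precedes marriage")
    if mother and mother.birth and person.birth and person.birth <= mother.birth:
        conflicts.append("born on or before mother's birth date")
    return conflicts


mother = Person("Mary", birth=date(1850, 3, 1))
child = Person("Ann", birth=date(1848, 1, 1),
               marriage=date(1870, 6, 1), death=date(1869, 1, 1))
print(find_conflicts(child, mother))
```

Checks like these only flag a profile for review; deciding which of the conflicting facts is wrong still requires looking at the underlying sources.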
Furthermore, FamilySearch has implemented a "Merge Experience" tool that uses AI to guide users through the process of combining duplicate records, a chronic problem in the 1.8-billion-person tree.3 By surfacing "Possible Duplicates" and calculating the likelihood of a match, the AI acts as a gatekeeper against the creation of "phantom ancestors".25
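A toy version of duplicate surfacing scores every pair of profiles on name, birth year, and birthplace and reports the pairs above a threshold. This is a sketch under stated assumptions: the field names, weights, and threshold are invented for illustration, and production systems typically use far richer record-linkage models.

```python
from difflib import SequenceMatcher


def match_score(a: dict, b: dict) -> float:
    """Heuristic similarity in [0, 1] between two person profiles."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    # Birth years more than ten years apart contribute nothing.
    year_sim = max(0.0, 1.0 - abs(a["birth_year"] - b["birth_year"]) / 10.0)
    place_sim = 1.0 if a["birth_place"].lower() == b["birth_place"].lower() else 0.0
    return 0.5 * name_sim + 0.3 * year_sim + 0.2 * place_sim


def possible_duplicates(people: list[dict], threshold: float = 0.85) -> list[tuple]:
    """Return (id, id, score) for every pair scoring at or above the threshold."""
    pairs = []
    for i in range(len(people)):
        for j in range(i + 1, len(people)):
            score = match_score(people[i], people[j])
            if score >= threshold:
                pairs.append((people[i]["id"], people[j]["id"], round(score, 3)))
    return pairs


people = [
    {"id": "A1", "name": "Johann Schmidt", "birth_year": 1821, "birth_place": "Bremen"},
    {"id": "B7", "name": "Johan Schmidt", "birth_year": 1822, "birth_place": "Bremen"},
    {"id": "C3", "name": "Maria Keller", "birth_year": 1821, "birth_place": "Bremen"},
]
pairs = possible_duplicates(people)
print(pairs)
```

The pairwise loop is quadratic, so at the scale of a tree with billions of profiles some form of blocking (comparing only profiles that share, say, a surname key) would be needed before scoring.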
Ancestry and Blackstone: The Privacy and Profit Filter
Ancestry’s public revelations concerning AI have focused more on "narrative generation" and "hint optimization." Features like "Storymaker Studio" use AI to draft biographical sketches from tree facts, while improved "hints" use AI to suggest new records to users with higher precision.19
However, the "scrubbing" mentioned in the context of Blackstone is often linked to Personally Identifiable Information (PII) and compliance with global privacy laws such as the General Data Protection Regulation (GDPR) and the Illinois Genetic Information Privacy Act (GIPA). Blackstone has publicly committed to protecting genetic privacy, explicitly stating it will not access user DNA data for non-genealogical purposes.19 This "PII scrubbing" is an essential defensive AI application, using Named Entity Recognition (NER) to identify and redact information regarding living persons from digital records before they are made public.27
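The redaction step that follows entity recognition can be sketched independently of any particular NER model. In the example below the entity spans are supplied by hand and stand in for the output of a recognizer; `span_of` and `redact` are hypothetical helpers, and spans are assumed not to overlap.

```python
def span_of(text: str, phrase: str, label: str) -> tuple[int, int, str]:
    """Locate a phrase and return a (start, end, label) span, as an NER model might."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


def redact(text: str, spans: list[tuple[int, int, str]]) -> str:
    """Replace each non-overlapping span with a [LABEL] placeholder."""
    redacted = text
    # Work from the end of the string so earlier offsets stay valid.
    for start, end, label in sorted(spans, reverse=True):
        redacted = redacted[:start] + f"[{label}]" + redacted[end:]
    return redacted


text = "Informant: Jane Doe, 14 Elm Street"
spans = [
    span_of(text, "Jane Doe", "PERSON"),
    span_of(text, "14 Elm Street", "ADDRESS"),
]
clean = redact(text, spans)
print(clean)
```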
The Economy of Accuracy: Incentives for Disclosure and Secrecy
The question of whether these organizations have incentives to hide or reveal data "cleansing" results is central to understanding the industry's future.
Incentives for Secrecy and Profit Preservation
For a commercial entity like Ancestry, owned by a private equity firm, the total volume of "records" is a primary marketing metric. If an AI cleansing process identifies that 15% of the "30 billion records" are actually low-quality duplicates or erroneous entries, the company has a short-term incentive to delay the "purging" of such data to maintain its perceived market dominance and subscriber value.29
Furthermore, revealing that a large percentage of connections in a user’s family tree are "speculative" (low certainty scores) could lead to "subscriber fatigue" and increased churn. In a subscription-driven model, there is a risk that high-transparency AI tools could devalue the years of research a user has already paid for by labeling it as "uncertain".30
Incentives for Disclosure and Theological Accuracy
Conversely, the LDS Church has a powerful, non-monetary incentive for total transparency and accuracy. In its theological framework, a "clean" and accurate record is not a luxury; it is a requirement for valid temple ordinances.1 Performing a baptism for a person who did not exist or for the wrong individual is seen as a failure of religious duty. Therefore, FamilySearch is incentivized to develop and disclose the most rigorous AI cleansing tools available, even if it results in a temporary reduction in "total names" in the tree.
Additionally, as genealogical data is increasingly used for medical research and forensic genealogy, "certainty factors" become a matter of legal and scientific necessity. If Ancestry were to provide data for a study on hereditary diseases, the medical community would require the exact "percent of certainty" scores for the familial connections in the dataset.31
Incentive Alignment Analysis | Commercial (Blackstone/Ancestry) | Non-Profit (LDS/FamilySearch) |
Record Volume | High volume supports marketing and valuation.13 | Accuracy is prioritized over raw volume.33 |
User Sentiment | "Hints" must be exciting to maintain engagement.19 | Records must be "True and Accurate" for theology.1 |
Privacy Compliance | AI scrubbing is a legal requirement to avoid lawsuits.16 | AI scrubbing protects the "110-year rule" for living persons.25 |
Transparency | Proprietary algorithms are trade secrets.19 | Collaborative tools (Labs) encourage public testing.3 |
Institutional Data Governance and the "110-Year Rule"
One of the most rigid data governance policies in the field is the LDS "110-Year Rule," which prohibits both the public disclosure of, and the performance of temple ordinances for, individuals born within the last 110 years without explicit permission from a close living relative.2 This rule acts as a "buffer" between historical records and modern privacy concerns.
AI plays a critical role in enforcing this rule across massive datasets. AI filters scan incoming records to identify birth dates and redact names of potential living persons. For instance, if a 1950 census is digitized, the AI must not only read the names but calculate the likelihood that the individuals listed are still alive, based on actuarial tables. This "PII scrubbing" is a form of proactive cleansing that satisfies both the religious policies of the Church and the legal requirements of the secular world.27
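The core of such a filter can be sketched with the 110-year threshold alone. The example below is a simplification: it estimates birth year from the age recorded in the census, treats anyone without a known death who would be under 110 as possibly living, and does not model the actuarial weighting mentioned above. The row fields and function names are hypothetical.

```python
PRIVACY_WINDOW_YEARS = 110  # threshold taken from the "110-Year Rule" described above


def estimated_birth_year(record_year: int, recorded_age: int) -> int:
    """Approximate birth year from the age written on a dated record."""
    return record_year - recorded_age


def possibly_living(record_year: int, recorded_age: int,
                    reference_year: int, known_death: bool = False) -> bool:
    """True when the person has no known death and would be under 110 years old."""
    if known_death:
        return False
    age_now = reference_year - estimated_birth_year(record_year, recorded_age)
    return age_now < PRIVACY_WINDOW_YEARS


def redact_census(rows: list[dict], record_year: int, reference_year: int) -> list[dict]:
    """Return copies of the rows with names of possibly living persons hidden."""
    cleaned = []
    for row in rows:
        hidden = possibly_living(record_year, row["age"], reference_year,
                                 row.get("known_death", False))
        cleaned.append({**row, "name": "[REDACTED]" if hidden else row["name"]})
    return cleaned


rows = [
    {"name": "Child A", "age": 5},
    {"name": "Parent B", "age": 40},
    {"name": "Child C", "age": 8, "known_death": True},
]
result = redact_census(rows, record_year=1950, reference_year=2026)
print([row["name"] for row in result])
```

In this sketch a five-year-old in the 1950 census is hidden, while a forty-year-old (born around 1910) and a child with a documented death remain visible.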
The Convergence of Archival Science and Machine Learning
The future of genealogy is trending toward a "unified field theory" of identity, where AI resolves the billions of disparate data points into a single, high-certainty global pedigree. The LDS Church provides the raw physical and theological fuel for this engine, while Blackstone provides the capital necessary to refine the technology.
As AI models become more adept at "knowledge unlearning" and "deduplication," we may see a period of "archival contraction," where the total number of records in global databases actually decreases as duplicates are purged and erroneous connections are severed.28 While this may present a challenge for the marketing departments of commercial genealogy firms, it represents the ultimate fulfillment of the archival mission: the creation of a definitive, accurate, and scientifically valid record of human existence.
The "percent of certainty" factor will likely become the standard metric by which all genealogical research is judged. Rather than a binary "found" or "not found," future researchers will operate in a probabilistic spectrum, where every ancestral connection is weighted by the aggregate intelligence of the global repository. In this new era, the "individual's existence" is no longer just a name on a page, but a high-probability node in a vast, institutional, and digital web of human history.
Technical Appendix: AI Reliability and Trust in Genealogical Systems
The integration of AI into genealogical research is not solely a matter of computational power but also one of "trust architecture." Research into the psychology of AI usage suggests that familiarity with AI and frequency of use are significant predictors of trust in AI outcomes, although in the estimates below confidence in AI for complex decisions carries the largest standardized coefficient.38
Determinants of Trust in AI-Derived Genealogical Data | Standardized Estimate (β) | Significance (p-value) |
Familiarity with AI Technology | 0.28 | 0.0005 38 |
Frequency of AI Use | 0.22 | 0.016 38 |
Confidence in Complex Decisions | 0.34 | < 0.001 38 |
Confidence in Memory/Recall Tasks | 0.21 | 0.003 38 |
These data suggest that as genealogists interact more frequently with AI "hints" and "certainty scores," the industry will reach a tipping point where AI-derived conclusions are treated with the same weight as original primary sources. However, the "domain expert" consensus remains that AI should be treated as a "lead" rather than a "conclusion," requiring human verification for the most critical historical nodes.22 The balance between AI-driven speed and human-verified accuracy remains the defining tension of the 2025–2030 genealogical era.
Works cited
1. Case Study Three: The Granite Mountain Record Vault - Benjamin Peters, accessed March 30, 2026, https://benjaminpeters.org/case-study-three-the-granite-mountain-record-vault/
2. Church Policies - FamilySearch, accessed March 30, 2026, https://www.familysearch.org/en/help/helpcenter/church-policies
3. FamilySearch Year in Review 2025, accessed March 30, 2026, https://www.familysearch.org/en/blog/familysearch-year-in-review-2025
4. FamilySearch Partnerships: Some Questions and Answers, accessed March 30, 2026, https://www.familysearch.org/en/blog/familysearch-partnerships-some-questions-and-answers
5. New Historical Records May 2025 - FamilySearch, accessed March 30, 2026, https://www.familysearch.org/en/blog/new-records-may-2025
6. New Free Historical Records from 30 Countries | February 2025 Update - FamilySearch, accessed March 30, 2026, https://www.familysearch.org/en/blog/new-records-february-2025
7. New Historical Records March 2025 - FamilySearch, accessed March 30, 2026, https://www.familysearch.org/en/blog/new-records-march-2025
8. New Free Historical Records from 7 Countries | October 2025 Update - FamilySearch, accessed March 30, 2026, https://www.familysearch.org/en/blog/new-records-october-2025
9. The World's Most Secure Buildings: The Granite Mountain Records Vault in Salt Lake City, Utah - Hirsch, accessed March 30, 2026, https://www.hirschsecure.com/us/en-us/blog/the-worlds-most-secure-buildings-the-granite-mountain-records-vault-in-salt-lake-city-utah
10. Granite Mountain (Salt Lake County, Utah) - Wikipedia, accessed March 30, 2026, https://en.wikipedia.org/wiki/Granite_Mountain_(Salt_Lake_County,_Utah)
11. Granite Mountain Records Vault - Intermountain Histories, accessed March 30, 2026, https://www.intermountainhistories.org/items/show/422
12. Granite Mountain Records Vault - Church Newsroom, accessed March 30, 2026, https://news-bb.churchofjesuschrist.org/article/granite-mountain-records-vault
13. Who Owns Ancestry.com? What You Need to Know, accessed March 30, 2026, https://www.genealogyexplained.com/who-owns-ancestry/
14. Ancestry.com - Wikipedia, accessed March 30, 2026, https://en.wikipedia.org/wiki/Ancestry.com
15. Blackstone explores $10bn exit options for Ancestry.com through IPO or sale, accessed March 30, 2026, https://pe-insights.com/blackstone-explores-10bn-exit-options-for-ancestry-com-through-ipo-or-sale/
16. For the Seventh Circuit - United States Court of Appeals, accessed March 30, 2026, https://media.ca7.uscourts.gov/cgi-bin/OpinionsWeb/processWebInputExternal.pl?Submit=Display&Path=Y2023/D05-01/C:22-2486:J:Scudder:aut:T:fnOp:N:3038250:S:0
17. Seventh Circuit Says All-Stock Acquisition — Without More — Does Not… - Fenwick, accessed March 30, 2026, https://www.fenwick.com/insights/publications/seventh-circuit-says-all-stock-acquisition-without-more-does-not-trigger-liability-under-illinois-genetic-information-privacy-act
18. Who Owns Ancestry.com? The $4.7 Billion Truth - YouTube, accessed March 30, 2026, https://www.youtube.com/watch?v=gHRDXUjljz8
19. Blackstone's Plans for Ancestry.com: What Could It Mean for ..., accessed March 30, 2026, https://genealogybargains.com/blackstones-plans-for-ancestry/
20. Ancestry.com and FamilySearch to Make a Billion Global Records Available Online, accessed March 30, 2026, https://www.familysearch.org/en/blog/ancestry-com-and-familysearch-to-make-a-billion-global-records-available-online
21. Confidence Unlocked: A Method to Measure Certainty in LLM Outputs - Medium, accessed March 30, 2026, https://medium.com/@vatvenger/confidence-unlocked-a-method-to-measure-certainty-in-llm-outputs-1d921a4ca43c
22. The Right Way to Use AI in Genealogy Research, accessed March 30, 2026, https://ancestralfindings.com/right-way-to-use-ai-in-genealogy/
23. DNA Testing Companies & Ethnicity Estimates Part I: The Ancestry Composition Tools of 23andMe - Legacy Tree Genealogists, accessed March 30, 2026, https://www.legacytree.com/blog/ancestry-composition-tools-23andme
24. Are Your Irish Roots Showing? Understanding Your Ancestry DNA Ethnicity Results, accessed March 30, 2026, https://familylocket.com/are-your-irish-roots-showing-understanding-your-ancestry-dna-ethnicity-results/
25. Starting Family Tree: Submitting Names for Temple Ordinances - FamilySearch, accessed March 30, 2026, https://www.familysearch.org/wiki/en/img_auth.php/4/40/FT-04_Starting_Family_Tree_-_Submitting_Names_for_Temple_Ordinances_handout_Approved_Nov_2018_REG.pdf
26. What do the different ordinance statuses mean? - FamilySearch, accessed March 30, 2026, https://www.familysearch.org/en/help/helpcenter/article/what-do-the-different-ordinance-statuses-mean
27. FamilySearch Data Processing and Transfer Terms Addendum, accessed March 30, 2026, https://www.familysearch.org/en/legal/data-processing
28. SoK: The Privacy Paradox of Large Language Models: Advancements, Privacy Risks, and Mitigation - arXiv, accessed March 30, 2026, https://arxiv.org/html/2506.12699v2
29. Genealogy rip-off alert: Ancestry jacks up monthly subscriptions up to - EasyGenie, accessed March 30, 2026, https://easygenie.org/blogs/news/ancestry-jacks-up-monthly-subscription-prices-up-to-25
30. When AI Gets a Good Telling Off | Family and House History Research Hampshire, accessed March 30, 2026, https://www.timefliesancestry.co.uk/blog/2026/02/23/when-ai-gets-a-good-telling-off/
31. AI-derived research domain criteria scores from medical records predict brain inflammatory markers in psychotic disorders: A cross-sectional, real-world study - PMC, accessed March 30, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC12847681/
32. The role of artificial intelligence for the application of integrating electronic health records and patient-generated data in clinical decision support - ResearchGate, accessed March 30, 2026, https://www.researchgate.net/publication/381118303_The_role_of_artificial_intelligence_for_the_application_of_integrating_electronic_health_records_and_patient-generated_data_in_clinical_decision_support
33. Genealogical Standards and Guidelines - International Institute - FamilySearch, accessed March 30, 2026, https://www.familysearch.org/en/wiki/Genealogical_Standards_and_Guidelines_-_International_Institute
34. Class Action Claims Ancestry.com Violated Genetic Privacy Law by Disclosing Data in Blackstone Acquisition, accessed March 30, 2026, https://www.classaction.org/news/class-action-claims-ancestry.com-violated-genetic-privacy-law-by-disclosing-data-in-blackstone-acquisition
35. How do I request ordinances for an ancestor who was born in the last 110 years?, accessed March 30, 2026, https://www.familysearch.org/en/help/helpcenter/article/how-do-i-request-ordinances-for-an-ancestor-who-was-born-in-the-last-110-years
36. Compiled Public Comments on the Request for Information on Responsibly Developing and Sharing Generative Artificial Intelligence Tools Using NIH Controlled Access Data, accessed March 30, 2026, https://osp.od.nih.gov/wp-content/uploads/2025/09/Compiled_Public_Comments_RFI_on_Responsibly_Developing_and_Sharing_GenAI_Tools_Using_NIH_Controlled_Access_Data.pdf
37. From Data to Evaluation: Investigating the Limits of the Individual‑Control Model in AI Governance - eScholarship.org, accessed March 30, 2026, https://escholarship.org/uc/item/7135b09g
38. Demographic influences on trust in artificial intelligence across cognitive domains: A statistical perspective - PMC, accessed March 30, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC12588509/
