The Nonlinear Library

The Nonlinear Fund
Feb 21, 2024 • 7min

AF - Weak vs Quantitative Extinction-level Goodhart's Law by Vojtech Kovarik

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Weak vs Quantitative Extinction-level Goodhart's Law, published by Vojtech Kovarik on February 21, 2024 on The AI Alignment Forum.

tl;dr: With claims such as "optimisation towards a misspecified goal will cause human extinction", we should be more explicit about the order of quantifiers (and the quantities) of the underlying concepts. For example, do we mean that for every misspecified goal, there exists a dangerous amount of optimisation power? Or that there exists an amount of optimisation power that is dangerous for every misspecified goal? (Also, how much optimisation? And how misspecified must the goals be?)

Central to worries about AI risk is the intuition that if we even slightly misspecify our preferences when giving them as input to a powerful optimiser, the result will be human extinction. We refer to this conjecture as Extinction-level Goodhart's Law[1].

Weak version of Extinction-level Goodhart's Law

To make Extinction-level Goodhart's Law slightly more specific, consider the following definition:

Definition 1: The Weak Version of Extinction-level Goodhart's Law is the claim that: "Virtually any[2] goal specification, pursued to the extreme, will result in the extinction[3] of humanity."[4]

Here, the "weak version" qualifier refers to two aspects of the definition. The first is the limit nature of the claim --- that is, the fact that the law only makes claims about what happens when the goal specification is pursued to the extreme. The second is best understood by contrasting Definition 1 with the following claim:

Definition 2: The Uniform Version of Extinction-level Goodhart's Law is the claim that: "Beyond a certain level of optimisation power, pursuing virtually any goal specification will result in the extinction of humanity."

In other words, the difference between Definitions 1 and 2 is the difference between

∀ (goal G s.t. [conditions]) ∃ (opt. power O) : Optimise(G, O) ⟹ extinction

∃ (opt. power O) ∀ (goal G s.t. [conditions]) : Optimise(G, O) ⟹ extinction.

Quantitative version of Goodhart's Law

While the distinction[5] between the weak and uniform versions of Extinction-level Goodhart's Law is important, it is too coarse to be useful for questions such as:

For a given goal specification, what are the thresholds at which (i) additional optimisation becomes undesirable, (ii) we would have been better off not optimising at all, and (iii) the optimisation results in human extinction?
Should we expect warning shots?
Does iterative goal specification work? That is, can we keep optimising until things start to get worse, then spend a bit more effort on specifying the goal, and repeat?

This highlights the importance of finding the correct quantitative version of the law:

Definition 3: A Quantitative Version of Goodhart's Law is any claim that describes the relationship between a goal specification, the optimisation power used to pursue it, and the (un)desirability of the resulting outcome.

We could also look specifically at the quantitative version of Extinction-level Goodhart's Law, which would relate the quality of the goal specification to the amount of optimisation power that can be used before pursuing this goal specification results in extinction.

Implications for AI-risk discussions

Finally, note that the weak version of Extinction-level Goodhart's Law is consistent with AI not posing any actual danger of human extinction.
This could happen for several reasons, including (i) because the dangerous amounts of optimisation power are unreachable in practice, (ii) because we adopt goals which are not amenable to using unbounded (or any) amounts of optimisation[6], or (iii) because iterative goal specification is in practice sufficient for avoiding catastrophes. As a result, we view the weak version of Extinction-level Goodhart's Law as a conservative claim, which the prop...
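The quantifier-order distinction at the heart of the post can be written out explicitly. A minimal LaTeX rendering of the two claims (the predicate names Optimise and extinction follow the post's informal notation; the surrounding document scaffolding is just to make it compile):

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Weak version: for EVERY goal specification G (meeting the
% misspecification conditions) there is SOME amount of optimisation
% power O that leads to extinction; the dangerous threshold may
% differ from goal to goal, and may be unreachable in practice.
\begin{equation}
  \forall G \;\exists O : \operatorname{Optimise}(G, O) \implies \text{extinction}
\end{equation}
% Uniform version: there is a SINGLE threshold O that is dangerous
% for EVERY such goal specification G at once.
\begin{equation}
  \exists O \;\forall G : \operatorname{Optimise}(G, O) \implies \text{extinction}
\end{equation}
\end{document}
```

Since the uniform version asserts one threshold that works for every goal, it logically implies the weak version but not conversely, which is why the post treats the weak version as the more conservative claim.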
Feb 21, 2024 • 7min

EA - Let's advertise EA infrastructure projects, Feb 2024 by Arepo

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Let's advertise EA infrastructure projects, Feb 2024, published by Arepo on February 21, 2024 on The Effective Altruism Forum.

This is the latest of a theoretically-three-monthly series of posts advertising EA infrastructure projects that struggle to get and maintain awareness (see the original advertising post for more on the rationale). I italicise organisations added since the previous post was originally submitted and bold those edited into the current one after posting, to make it easier for people to scan for new entries. Also, since funding seems to be very tight in the community at the moment, I've added a section to the end on 'Organisations urgently seeking donations'. I don't have any inclusion criteria at the moment beyond signal-boosting posts people have made in the last few months, especially those in an existential crisis from lack of funding.

Coworking/socialising

EA Gather Town - An always-on virtual meeting place for coworking, connecting, and having both casual and impactful conversations
EA Anywhere - An online EA community for everyone
EA coworking Discord - A Discord server dedicated to online coworking

Free or subsidised accommodation

CEEALAR (formerly the EA Hotel) - Provides free or subsidised serviced accommodation and board, and a moderate stipend for other living expenses.
Nonlinear's EA house database - An experiment by Nonlinear to try to connect EAs with extra space with EAs who could do good work if they didn't have to pay rent (or could pay less rent).

Professional services

Amber Dawn - a freelance writer and editor for the EA community who can help you edit drafts and write up your unwritten ideas.
WorkStream Business Systems - a service dedicated to EAs, helping you improve your workflow, boost your bottom line and take control of your business
cFactual - a new, EA-aligned strategy consultancy with the purpose of maximising its counterfactual impact
Good Governance Project - helps EA organizations create strong boards by finding qualified and diverse professionals
Altruistic Agency - provides discounted tech support and development to organisations
Tech support from Soof Golan
Legal advice from Tyrone Barugh - a practice under consideration with the primary aim of providing legal support to EA orgs and individual EAs, with that practice probably being based in the UK.
SEADS - Data Science services to EA organizations
User-Friendly - an EA-aligned marketing agency
Anti Entropy - offers services related to operations for EA organizations
Arb - Our consulting work spans forecasting, machine learning, and epidemiology. We do original research, evidence reviews, and large-scale data pipelines.
Pineapple Operations - Maintains a public database of people who are seeking operations or Personal Assistant/Executive Assistant work (part- or full-time) within the next 6 months in the Effective Altruism ecosystem

Coaching

Elliot Billingsley - Coaching is best for people who have personal or professional goals they're serious about accomplishing. My sessions are designed to improve clarity and motivation.
Tee Barnett Coaching (coaching training) - a multi-component training infrastructure for developing your own practice as a skilled coach.
(coach matchmaking) - Access matchmaking to high-quality coaching at below-market pricing
Probably Good - Whether you're a student searching for the right path or an experienced professional seeking a purpose-driven opportunity, we're here to help you brainstorm career paths, evaluate options, and plan next steps
AI Safety Support - health coaching to people working on AI safety (first session free)
80,000 Hours career coaching - Speak with us for free about using your career to help solve one of the world's most pressing problems
Yonatan Cale - Coaching for software devs
FAANG-style mock interviews - senior software...
Feb 21, 2024 • 48min

EA - The Case for Animal-Inclusive Longtermism by BrownHairedEevee

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Case for Animal-Inclusive Longtermism, published by BrownHairedEevee on February 21, 2024 on The Effective Altruism Forum.

In: Journal of Moral Philosophy
Author: Gary David O'Brien
Online Publication Date: 19 Jan 2024
License: Creative Commons Attribution 4.0 International

Abstract

Longtermism is the view that positively influencing the long-term future is one of the key moral priorities of our time. Longtermists generally focus on humans, and neglect animals. This is a mistake. In this paper I will show that the basic argument for longtermism applies to animals at least as well as it does to humans, and that the reasons longtermists have given for ignoring animals do not withstand scrutiny. Because of their numbers, their capacity for suffering, and our ability to influence their futures, animals ought to be a central concern of longtermists. Furthermore, I will suggest that longtermism is a fruitful framework for thinking about the wellbeing of animals, as it helps us to identify actions we can take now that have a reasonable chance of improving the wellbeing of animals over the very long term.

Keywords: longtermism; animal ethics; wild animal suffering

Introduction

Longtermism is the view that positively influencing the long-term future is one of the key moral priorities of our time.1 Since the future has the potential to be truly vast, both in duration and the number of individuals who will ever live, it is plausible that the long-term future might be extremely valuable, or extremely disvaluable. If we care about impartially doing good, then we should be especially concerned to ensure that the long-term future goes well, assuming that it is within our power to do so. Most longtermists focus on humans, and largely ignore animals. This is a mistake. In this paper I will show that the basic argument for longtermism applies to animals at least as well as it does to humans, and that the reasons longtermists have given for ignoring animals do not stand up to scrutiny. I will argue that, because of their numbers, their capacity for suffering, and our ability to influence their futures, animals ought to be a central concern of longtermists. Furthermore, I will suggest that longtermism is a fruitful framework for thinking about the wellbeing of animals, as it helps us to identify effective actions that we can take in the near future that have a reasonable chance of improving the wellbeing of animals over the very long term. In Section 1 I will lay out the basic argument for longtermism and consider some of the reasons why longtermists have neglected animals. In Sections 2 and 3 I will show that the basic argument for longtermism applies to animals and that we can use the longtermist framework to identify interventions that have a reasonable chance of making the long-term future go better for animals. More specifically, I will argue that (1) now or in the near-term future humans can act in ways that will predictably increase or decrease the scale and duration of wild animal suffering in the long term and (2) we are in an especially influential time for locking in values that can be expected to be good or bad for domesticated animals in the long term. Finally in Section 4 I will suggest some longtermist interventions for animals that might be more effective than short-term alternatives and will suggest areas for further research.
For simplicity, I will assume a hedonistic theory of animal wellbeing, though nothing I say will be incompatible with the view that there are also important non-hedonic elements related to animal wellbeing. I will assume that all vertebrates have the capacity for sentience, and hence for positive and negative welfare. Although I will not have space to argue for this, I will assume the increasingly accepted view that the majority of animals in ...
Feb 21, 2024 • 6min

LW - Why does generalization work? by Martín Soto

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why does generalization work?, published by Martín Soto on February 21, 2024 on LessWrong.

Just an interesting philosophical argument.

I. Physics

Why can an ML model learn from part of a distribution or data set, and generalize to the rest of it? Why can I learn some useful heuristics or principles in a particular context, and later apply them in other areas of my life? The answer is obvious: because there are some underlying regularities between the parts I train on and the ones I test on. In the ML example, generalization won't work when approximating a function which is a completely random jumble of points. Also, quantitatively, the more regular the function is, the better generalization will work. For example, polynomials of lower degree require fewer data points to pin down. The same goes for periodic functions. Also, a function with a lower Lipschitz constant allows for better bounding of its values at un-observed points. So it must be that the variables we track (the ones we try to predict or control, either with data science or our actions) are given by disproportionately regular functions (relative to random ones). In this paper by Tegmark, the authors argue exactly this: most macroscopic variables of interest have Hamiltonians of low polynomial degree, and this happens because of some underlying principles of low-level physics, like locality, symmetry, or the hierarchical composition of physical processes. But then, why is low-level physics like that?

II. Anthropics

If our low-level physics wasn't conducive to creating macroscopic patterns and regularities, then complex systems capable of asking that question (like ourselves) wouldn't exist. Indeed, we ourselves are nothing more than a specific kind of macroscopic pattern. So anthropics explains why we should expect such patterns to exist, similarly to how it explains why the gravitational constant, or the ratio between sound and light speed, are the right ones to allow for complex life.

III. Dust

But there's yet one more step. Let's try to imagine a universe which is not conducive to such macroscopic patterns. Say you show me its generating code (its laws of physics), and run it. To me, it looks like a completely random mess. I am not able to differentiate any structural regularities that could be akin to the law of ideal gases, or the construction of molecules or cells. On the contrary, if you showed me the running code of this reality, I'd be able (certainly after many efforts) to differentiate these conserved quantities and recurring structures. What are, exactly, these macroscopic variables I'm able to track, like "pressure in a room", or "chemical energy in a cell"? Intuitively, they are a way to classify all possible physical arrangements into more coarse-grained buckets. In the language of statistical physics, we'd say they are a way to classify all possible microstates into a macrostate partition. For example, every possible numerical value for pressure is a different macrostate (a different bucket), that could be instantiated by many different microstates (exact positions of particles). But there's a circularity problem.
When we say a certain macroscopic variable (like pressure) is easily derived from others (like temperature), or that it is a useful way to track another variable we care about (like "whether a human can survive in this room"), we're being circular. Given I already have access to a certain macrostate partition (temperature), or that I already care about tracking a certain macrostate partition (aliveness of human), then I can say it is natural or privileged to track another partition (pressure). But I cannot motivate the importance of pressure as a macroscopic variable from just looking at the microstates. Thus, "which parts of physics I consider interesting macroscopic varia...
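A minimal sketch of the quantitative claim in the Physics section above, using NumPy (illustrative numbers, not from the post): a low-degree polynomial is pinned down by a handful of samples and so generalizes perfectly, while a random jumble of points admits no such recovery.

```python
import numpy as np

rng = np.random.default_rng(0)
xs = np.linspace(-1, 1, 50)

# A "regular" target: a degree-2 polynomial. Three samples suffice
# to pin it down exactly, so the fit generalizes to all other points.
poly = lambda x: 2 * x**2 - x + 0.5
train_x = np.array([-1.0, 0.0, 1.0])
coeffs = np.polyfit(train_x, poly(train_x), deg=2)
print("poly test error:  ", np.abs(np.polyval(coeffs, xs) - poly(xs)).max())

# An "irregular" target: an i.i.d. random jumble of points. Fitting
# 3 of them says essentially nothing about the remaining 47.
jumble = rng.normal(size=xs.shape)
coeffs2 = np.polyfit(xs[:3], jumble[:3], deg=2)
print("jumble test error:", np.abs(np.polyval(coeffs2, xs) - jumble).max())
```

The first error is at floating-point noise level; the second is of order the spread of the random data, which is the post's point about regularity being what makes generalization possible at all.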
Feb 21, 2024 • 4min

EA - Moral Trade Proposal with 95-100% Surplus by Pete Rowlett

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Moral Trade Proposal with 95-100% Surplus, published by Pete Rowlett on February 21, 2024 on The Effective Altruism Forum.

Introduction

This post is a continuation of my earlier "Modeling Moral Trade in Antibiotic Resistance and Alternative Proteins," "Generating More Surplus in Moral Trades," and "Developing Counterfactual Trust in Moral Trade." Here I'll make a proposal for a specific moral trade, and then I'll provide a resource that will hopefully facilitate more trades. I think moral trade is an underexplored topic with significant opportunity for gains. The lowest-hanging fruit seems to be the synergy between animal welfare groups and climate groups. Both accept alternative proteins as one of the best uses of marginal funding (GFI is a top-rated charity by both Animal Charity Evaluators and Giving Green).

Proposal

I am proposing a trade between funds run by these groups: on one side, the Giving Green Fund, and on the other, Animal Charity Evaluators' Recommended Charity Fund. The Giving Green Fund has distributed funds to its top charities twice. The first time, this past summer, each organization received $50,000, and $50,000 was saved for later. The second time, at the end of 2023, the funds were not evenly distributed: they sent $100,000 to Industrious Labs, $70,000 to the Good Food Institute, and $50,000 each to Good Energy Collective, Evergreen Collaborative, and Clean Air Task Force. The justifications, with very transparent reasoning, are here and here. So while the uneven distribution may make counterfactual trust harder to build, the clarity of the process should largely counteract that effect. The ACE fund has consistently distributed money to top charities and standout charities, including GFI, in consistent ratios. Recently they've switched to a binary recommended-or-not-recommended status for charities, so it seems reasonable to assume that they would allocate money from the fund evenly between all of the recommended charities. Normally, high cost-effectiveness ratings would harm counterfactual trust and discourage actors from engaging in moral trade, but in this case both funds have a fairly strong track record of systematically allocating funding, so determining the counterfactual is relatively easy. I would advocate for both funds to redirect an additional $50,000 from their other funded nonprofits to GFI as a first moral trade. Both can contribute an equal amount because they have roughly equal relative cost-effectiveness estimates. I estimate that a 95 to 100% surplus should be generated from this trade (i.e. both worldviews will get 95 to 100% more moral value from those $50,000 than they would have gotten had they simply donated to the alternative top charity without trade). You can see my calculations here. It may make sense to make the reallocation smaller if GFI will have difficulty absorbing the marginal funding at similar levels of cost-effectiveness (though I doubt this will be the case, since they have an 8-figure budget). Another consideration is whether the other top nonprofits were relying on an expected donation - it's important to avoid messing up their plans. Terms could also be negotiated based on up-to-date cost-effectiveness estimates of both GFI and the alternate top charities.
For example, Giving Green may find some of ACE's other recommended charities to be somewhat effective, making the trade less valuable for them, and meaning that they donate less. The value generated here will come from both the moral trade itself, and from the information value of attempting to conduct a moral trade. A writeup about the execution may encourage others to take similar actions. Giving Green did a quick review and was fine with my posting this, but did not have time to review it in detail before the day I scheduled to post and has not en...
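A back-of-the-envelope sketch of how a surplus figure in this range can arise (all cost-effectiveness numbers below are hypothetical placeholders; the post's actual calculations are in the linked spreadsheet):

```python
# Each worldview values its own alternative top charity at 1.0 units
# of moral value per dollar, the other side's top charity at ~0, and
# GFI at 0.975 per dollar (placeholder numbers, not the post's).
value_of_gfi = 0.975
value_of_alternative = 1.0
donation = 50_000  # amount redirected by EACH fund

# Without trade: each side's $50k goes to its own top charity.
baseline = donation * value_of_alternative

# With trade: BOTH donations go to GFI, and each worldview counts the
# full $100k, since GFI serves climate and animal-welfare goals alike.
with_trade = 2 * donation * value_of_gfi

surplus = with_trade / baseline - 1
print(f"surplus per worldview: {surplus:.0%}")  # -> 95%
```

On these placeholder numbers each worldview turns $50,000 of value into $97,500, the stated 95% figure; valuing GFI at full parity with the alternative top charity gives the 100% endpoint.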
Feb 21, 2024 • 2min

LW - Less Wrong automated systems are inadvertently Censoring me by Roko

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Less Wrong automated systems are inadvertently Censoring me, published by Roko on February 21, 2024 on LessWrong.

Just a short post to highlight an issue with debate on LW; I have recently been involved, with some interest, in the debate on covid-19 origins here. User viking_math posted a response which I was keen to respond to, but it is not possible for me to respond to that debate (or any other), because the LW site has rate-limited me to one comment per 24 hours, since my recent comments are at -5 karma or less. So I feel that I should highlight that one side of the debate (my side) is simply not going to be here. I can't prosecute a debate like this. This is, funnily enough, an example of brute-force manufactured consensus - there will be a debate, people will make points on their side, and the side I am arguing for will be missing, so observers will conclude that there are no valid counterarguments, rather than that there are counterarguments but they were censored. I think this is actually quite a good model of how the world has reached the wrong conclusion about various things (which may include covid-19 origins, assuming that covid-19 was actually a lab leak, which is not certain). This is perhaps even more interesting than whether covid-19 came from a lab or not - we already knew before 2019 that bioerror was a serious risk. But I feel that we underestimate just how powerful multiple synergistic brute-force consensus mechanisms are at generating an information cascade into the incorrect conclusion. I'm sure these automated systems were constructed with good intentions, but they do constitute a type of information cascade mechanism - people choose to downvote, so you cannot reply, so it looks like you have no arguments, so people choose to downvote more, and so on.

Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
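The downvote-then-silence loop described here is structurally similar to classic sequential information-cascade models. A toy simulation of that general mechanism (a sketch with invented parameters, not a model of LessWrong's actual karma system):

```python
import random

def cascade(n_voters=100, p_correct_signal=0.6, seed=1):
    """Voters arrive in sequence. Each gets a noisy private signal about
    whether an argument is good, but also sees the running score. Once
    the visible score is lopsided enough, voters follow it and ignore
    their own signal, so early votes can lock in the wrong answer."""
    random.seed(seed)
    truth = True  # suppose the silenced side is actually correct
    score = 0
    for _ in range(n_voters):
        signal = truth if random.random() < p_correct_signal else not truth
        if abs(score) >= 2:   # lopsided score: herd on the visible majority
            vote = score > 0
        else:                 # otherwise: trust the private signal
            vote = signal
        score += 1 if vote else -1
    return score

wrong = sum(cascade(seed=s) < 0 for s in range(1000))
print(f"{wrong / 10:.1f}% of runs lock in the wrong consensus")
```

With these parameters roughly three in ten runs converge on the wrong consensus even though a majority of private signals point the right way - the flavour of information cascade the post is gesturing at.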
Feb 21, 2024 • 21min

EA - Meta EA Regional Organizations (MEAROs): An Introduction by Rockwell

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Meta EA Regional Organizations (MEAROs): An Introduction, published by Rockwell on February 21, 2024 on The Effective Altruism Forum.

Thank you to the many MEARO leaders who provided feedback and inspiration for this post, directly and through their work in the space.

Introduction

In the following post, I will introduce MEAROs - Meta EA Regional Organizations - a new term for a long-established segment of the EA ecosystem. I will provide an overview of the roles MEAROs currently serve and a sketch of what MEAROs could look like and could accomplish if more fully resourced. Though MEAROs have existed as long as EA itself, I think the concept has been underdefined, underexplored, and consequently underutilized as a tool for solving the world's most pressing problems. I'm hopeful that giving MEAROs a name will help the community at large better understand these organizations and prompt wider discussion on how to strategically develop them over time.

Background

By way of background, I have worked full-time on a MEARO - Effective Altruism New York City - since August 2021. In my role with EA NYC, I consider my closest collaborators to be not only the direct EA NYC team but also the leaders of other MEAROs, especially those who likewise receive a portion of their organization's funding through the Centre for Effective Altruism's Community Building Grants (CBG) Program. As I previously stated on the Forum, before the FTX collapse there was a heavy emphasis on making community building a long-term and sustainable career path.[1] As a result, there are now dozens of people working professionally, and often full-time, on MEAROs. This is a notable and very recent shift:

Many MEAROs were founded shortly after EA was named, or morphed out of communities that predated EA.
Most MEAROs were volunteer-run for the majority of their existence.
CEA launched the CBG Program in 2018 and slowly expanded its scope through 2022. EA NYC, for example, was volunteer-run for over seven years before receiving funding for two full-time employees through the CBG Program in Summer 2020.

This has led to a game of catch-up: MEAROs have professionalized, but many in the broader EA community still think of MEAROs as volunteer-operated clubs rather than serious young nonprofits. We also now have significantly more brainpower thinking about ways to maximize impact through the MEARO structure,[2] a topic I do not feel has been adequately explored on the Forum. (I recommend Jan Kulveit's posts from October 2018 - Why develop national-level effective altruism organizations? and Suggestions for developing national-level effective altruism organizations - for among the most relevant early discourse I'm aware of on the Forum.) I hope this post can not only give the broader EA ecosystem a better sense of the roles MEAROs currently serve but also open discussion and get others thinking about how we can use MEAROs more effectively.

Defining MEAROs

MEAROs work to enhance and support the EA movement and its objectives within specific regions. This description is intentionally broad, as MEAROs' work varies substantially between organizations and over time. My working definition of MEAROs requires the following characteristics:

1. Region-Specific Focus

True to EA values, MEAROs maintain a global outlook and are committed to solving the world's most pressing problems, but do this by promoting and supporting the EA movement and its objectives within a particular geographical area. The region could be a city, state, country, or alternative geographical unit, and the MEARO's activities and initiatives are typically tailored to the context and needs of that region.

2. Focus on Meta-EA

Meta Effective Altruism - the branch of the EA ecosystem MEAROs sit within - describes efforts to improve the efficiency, reach, and impact of the effect...
Feb 21, 2024 • 59min

LW - AI #51: Altman's Ambition by Zvi

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI #51: Altman's Ambition, published by Zvi on February 21, 2024 on LessWrong.

[Editor's note: I forgot to post this to WordPress on Thursday. I'm posting it here now. Sorry about that.]

Sam Altman is not playing around. He wants to build new chip factories in the decidedly unsafe and unfriendly UAE. He wants to build up the world's supply of energy so we can run those chips. What does he say these projects will cost? Oh, up to seven trillion dollars. Not a typo. Even scaling back the misunderstandings, this is what ambition looks like. It is not what safety looks like. It is not what OpenAI's non-profit mission looks like. It is not what it looks like to have concerns about a hardware overhang and use that as a reason why one must build AGI soon, before someone else does. The entire justification for OpenAI's strategy is invalidated by this move. I have spun off reactions to Gemini Ultra into their own post.

Table of Contents

Introduction.
Table of Contents.
Language Models Offer Mundane Utility. Can't go home? Declare victory.
Language Models Don't Offer Mundane Utility. Is AlphaGeometry even AI?
The Third Gemini. Its own post, link goes there. Reactions are mixed.
GPT-4 Real This Time. Do you remember when ChatGPT got memory?
Deepfaketown and Botpocalypse Soon. Bot versus bot, potential for AI hacking.
They Took Our Jobs. The question is, will they also take the replacement jobs?
Get Involved. A new database of surprising AI actions.
Introducing. Several new competitors.
Altman's Ambition. Does he actually seek seven trillion dollars?
Yoto. You only train once. Good luck! I don't know why. Perhaps you'll die.
In Other AI News. Andrej Karpathy leaves OpenAI, self-discover algorithm.
Quiet Speculations. Does every country need their own AI model?
The Quest for Sane Regulation. A standalone post on California's SB 1047.
Washington D.C. Still Does Not Get It. No, we are not confused about this.
Many People are Saying. New Yorkers do not care for AI, want regulations.
China Watch. Not going great over there, one might say.
Roon Watch. If you can.
How to Get Ahead in Advertising. Anthropic super bowl ad.
The Week in Audio. Sam Altman at the World Government Summit.
Rhetorical Innovation. Several excellent new posts, and a protest.
Please Speak Directly Into this Microphone. AI killer drones now?
Aligning a Smarter Than Human Intelligence is Difficult. Oh Goody.
Other People Are Not As Worried About AI Killing Everyone. Timothy Lee.
The Lighter Side. So, what you're saying is…

Language Models Offer Mundane Utility

The Washington D.C. government is exploring using AI for mundane utility. Deliver your Pakistani presidential election victory speech while you are in prison. Terence Tao suggests a possible application for AlphaGeometry. Help rescue your Factorio save from incompatible mods written in Lua. Shira Ovide says you should use it to summarize documents, find the exact right word, get a head start on writing something difficult, dull, or unfamiliar, or make cool images you imagine, but not to get info about an image, define words, identify synonyms, get personalized recommendations, or give you a final text. Her position is mostly that this second set of uses is unreliable. Which is true, and you do not want to exclusively or non-skeptically rely on the outputs, but so what? Still seems highly useful.
Language Models Don't Offer Mundane Utility

AlphaGeometry is not about AI? It seems that what AlphaGeometry is mostly doing is combining DD+AR, essentially labeling everything you can label and hoping the solution pops out. The linked post claims that doing this without AI is good enough in 21 of the 25 problems that it solved, although a commenter notes the paper seems to claim it was somewhat less than that. If it was indeed 21, and to some extent even if it wasn't...
Feb 21, 2024 • 12min

EA - Farmed animal funding towards Africa is growing but remains highly neglected by AnimalAdvocacyAfrica

The podcast discusses the growth of funding for farmed animal advocacy in Africa, highlighting disparities and the need for increased resources. Major funders have significantly increased contributions since 2020, with a focus on neglected regions like Africa. South Africa receives the most funding due to its developed economy, emphasizing the need for more support across the continent.
Feb 20, 2024 • 6min

AF - Why does generalization work? by Martín Soto

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why does generalization work?, published by Martín Soto on February 20, 2024 on The AI Alignment Forum.

Just an interesting philosophical argument.

I. Physics

Why can an ML model learn from part of a distribution or data set, and generalize to the rest of it? Why can I learn some useful heuristics or principles in a particular context, and later apply them in other areas of my life? The answer is obvious: because there are some underlying regularities between the parts I train on and the ones I test on. In the ML example, generalization won't work when approximating a function which is a completely random jumble of points. Also, quantitatively, the more regular the function is, the better generalization will work. For example, polynomials of lower degree require fewer data points to pin down. The same goes for periodic functions. Also, a function with a lower Lipschitz constant allows for better bounding of its values at un-observed points. So it must be that the variables we track (the ones we try to predict or control, either with data science or our actions) are given by disproportionately regular functions (relative to random ones). In this paper by Tegmark, the authors argue exactly this: most macroscopic variables of interest have Hamiltonians of low polynomial degree, and this happens because of some underlying principles of low-level physics, like locality, symmetry, or the hierarchical composition of physical processes. But then, why is low-level physics like that?

II. Anthropics

If our low-level physics wasn't conducive to creating macroscopic patterns and regularities, then complex systems capable of asking that question (like ourselves) wouldn't exist. Indeed, we ourselves are nothing more than a specific kind of macroscopic pattern. So anthropics explains why we should expect such patterns to exist, similarly to how it explains why the gravitational constant, or the ratio between sound and light speed, are the right ones to allow for complex life.

III. Dust

But there's yet one more step. Let's try to imagine a universe which is not conducive to such macroscopic patterns. Say you show me its generating code (its laws of physics), and run it. To me, it looks like a completely random mess. I am not able to differentiate any structural regularities that could be akin to the law of ideal gases, or the construction of molecules or cells. On the contrary, if you showed me the running code of this reality, I'd be able (certainly after many efforts) to differentiate these conserved quantities and recurring structures. What are, exactly, these macroscopic variables I'm able to track, like "pressure in a room", or "chemical energy in a cell"? Intuitively, they are a way to classify all possible physical arrangements into more coarse-grained buckets. In the language of statistical physics, we'd say they are a way to classify all possible microstates into a macrostate partition. For example, every possible numerical value for pressure is a different macrostate (a different bucket), that could be instantiated by many different microstates (exact positions of particles). But there's a circularity problem.
When we say a certain macroscopic variable (like pressure) is easily derived from others (like temperature), or that it is a useful way to track another variable we care about (like "whether a human can survive in this room"), we're being circular. Given I already have access to a certain macrostate partition (temperature), or that I already care about tracking a certain macrostate partition (aliveness of human), then I can say it is natural or privileged to track another partition (pressure). But I cannot motivate the importance of pressure as a macroscopic variable from just looking at the microstates. Thus, "which parts of physics I consider interesting macr...
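A minimal illustration of the macrostate-partition idea from the Dust section (a toy system with invented numbers, not from the post): many distinct microstates collapse into the same coarse-grained bucket, and it is the bucket, not any single microstate, that a variable like "pressure" tracks.

```python
import itertools

# Toy system: 4 particles, each either in the left (0) or right (1)
# half of a box. A microstate is the exact tuple of positions; the
# macrostate "how many particles are on the right" coarse-grains it.
microstates = list(itertools.product([0, 1], repeat=4))

macrostates = {}
for m in microstates:
    macrostates.setdefault(sum(m), []).append(m)

for count, members in sorted(macrostates.items()):
    print(f"{count} on the right: {len(members)} of {len(microstates)} microstates")
```

The middle macrostate (2 on the right) is realised by 6 of the 16 microstates; nothing in the microstates themselves singles out "count on the right" as the partition worth tracking, which is exactly the circularity worry the post raises about pressure.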
