The Nonlinear Library

The Nonlinear Fund
Mar 15, 2024 • 6min

EA - I'm glad I joined an experienced, 20+ person organization by michel

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: I'm glad I joined an experienced, 20+ person organization, published by michel on March 15, 2024 on The Effective Altruism Forum. This is a Draft Amnesty Week draft. It may not be polished up to my usual standards. I originally started this post for the EA Forum's career week last year, but I missed the deadline. I've used Draft Amnesty Week as a nudge to fix up a few bullets and am just sharing what I got. In which: I tentatively conclude I made the right choice by joining CEA instead of doing independent alignment research or starting my own EA community building project. In December and January last year, I spent a lot of time thinking about what my next career move should be. I was debating roughly four choices: joining the CEA Events Team; beginning independent research in AI strategy and governance; supporting early-stage (relatively scrappy) AI safety field-building efforts; or starting an EA community or infrastructure building project.[1] I decided to join the CEA events team, and I'm glad I did. I'm moderately sure this was the right choice in hindsight (maybe 60%), but counterfactuals are hard and, who knows, maybe one of the other paths would have proved even better. Here are some benefits from CEA that I think would have been harder for me to get on the other paths. I get extended contact with - and feedback from - very competent people. Example: I helped organize the Meta Coordination Forum and worked closely with Max Dalton and Sophie Thomson as a result. I respect both of them a lot, and they both regularly gave me substantive feedback on my idea generation, emails, docs, etc.
I learn a lot of small but, in aggregate, important things that would be more effortful to learn on my own. Examples: How to organize a Slack workspace, how to communicate efficiently, when and how to engage with lawyers, how to utilize virtual assistants, how to build a good team culture, how to write a GDoc that people can easily skim, when to leave comments and how to do so quickly, how to use decision-making tools like BIRD, how to be realistic about impact evaluations, etc. I have a support system. Example: I've been dealing with post-concussion symptoms for the past year, and having private healthcare has helped me address those symptoms. Example: Last year I was catastrophizing about a project I was leading. After telling my manager how anxious I had been about the project, we met early that week, checked in on the status of all the different work streams, and clarified next steps. By the end of the week I felt much better. I think I have a more realistic model of how organizations, in general, work. I bet this helps me predict other orgs' behavior and engage with them productively. It would probably also help me start my own org. Example: If I want Open Phil to do X, it's become clear to me that I should probably think about who at OP is most directly responsible for X, write up the case for X in an easy-to-skim way with a lot of reasoning transparency, and then send that doc to the person and express a willingness to meet to talk more about it. And all the while I should be nice and humble, because there's probably a lot of behind-the-scenes stuff I don't know about. And the people whose behavior I want to change are probably very busy and have a ton of daily execution work that makes it hard for them to zoom out to the level I'm likely asking them to. Example: I better understand the time and overhead costs of making certain info transparent and doing public comms well, so I have more realistic expectations of other orgs.
Example: If I were to start my own org, I would have a better sense of how to set a vision, how to ship MVPs and test hypotheses, as well as a more intuitive sense of when things are going well vs. poorly. If I want to later work at a non-EA org, my expe...
Mar 15, 2024 • 34min

EA - Maternal Health Initiative is Shutting Down by Ben Williamson

This is: Maternal Health Initiative is Shutting Down, published by Ben Williamson on March 15, 2024 on The Effective Altruism Forum. Maternal Health Initiative (MHI) was founded out of Charity Entrepreneurship (AIM)'s 2022 Incubation Program and has since piloted two interventions integrating postpartum (post-birth) contraceptive counselling into routine care appointments in Ghana. We concluded this pilot work in December 2023. A stronger understanding of the context and impact of postpartum family planning work, on the back of our pilot results, has led us to conclude that our intervention is not among the most cost-effective interventions available. We've therefore decided to shut down and redirect our funding to other organisations. This article summarises MHI's work, our assessment of the value of postpartum family planning programming, and our decision to shut down MHI as an organisation in light of our results. We also share some lessons learned. An in-depth report expanding on the same themes is available on our website. We encourage you to skip to the sections that are of greatest interest: For people interested in the practicalities of development work, we recommend 'MHI: An Overview of Our Work'. For those interested in family planning programming, we recommend 'Pilot: Results', 'Why We No Longer Believe Postpartum Family Planning Is Among The Most Cost-Effective Interventions', and 'Broader Thoughts on Family Planning'. Finally, for those interested in broader lessons around entrepreneurship and organisation-building, we recommend 'Choosing to Shut Down' and 'Lessons'. Why we chose to pursue postpartum family planning Why family planning? Pregnancy-related health outcomes are a leading cause of preventable death among both mothers and children.
In 2017, almost 300,000 women and girls died due to either pregnancy or childbirth (WHO, 2017). Cleland et al. (2006) estimate that comprehensive access to contraception could avert more than 30% of maternal deaths and 10% of child mortality globally. Contraceptive access provides a wide range of other potential benefits, the most significant of which may be increasing reproductive autonomy for women who want to space or limit births and currently have limited options for doing so. Why postpartum (post-birth)? Postpartum family planning (PPFP) - that is, integrating family planning guidance into postnatal care and/or child immunisation appointments - has been identified as an effective way of increasing contraceptive uptake and reducing unmet need (Wayessa et al. (2020); Saeed et al. (2008); Tran et al. (2020); Tran et al. (2019); Pearson et al. (2020); Dulli et al. (2016)). The maternal and infant mortality risks from short birth spacing make the postpartum period a point of particular value for increased contraceptive access. Demographic Health Survey (DHS) analysis suggests an 18% increase in neonatal mortality, a 21% increase in child mortality, and a 32% increase in mortality risk for births that occur within two years of a prior pregnancy (Kozuki and Walker, 2013; Conde-Agudelo et al., 2007). While it is often official policy that family planning counselling should be included in postnatal care (Ghana Health Service, 2014), the consistency and quality of family planning services in the postpartum period vary in practice (Morhe et al., 2017). MHI: An overview of our work Charity Entrepreneurship (AIM) recommended postpartum family planning as part of the 2022 Incubation Program through which MHI was founded. As such, MHI has had an explicit focus on postpartum family planning work since its beginning.
We spent our first few months interviewing a few dozen experts, getting up to speed with research in the field, and selecting priority target countries. Based on this work, we visited Sierra Leone and Ghana in...
Mar 15, 2024 • 6min

EA - We Did It! - Victory for Octopus in Washington State by Tessa @ ALI

This is: We Did It! - Victory for Octopus in Washington State, published by Tessa @ ALI on March 15, 2024 on The Effective Altruism Forum. In 2022, Aquatic Life Institute (ALI) led the charge in Banding Together to Ban Octopus Farming. In 2024, we are ecstatic to see these efforts come to fruition in Washington State. This landmark achievement underscores our collective commitment to rejecting the introduction of additional animals into the seafood system and positions Washington State as a true pioneer in aquatic animal welfare legislation. In light of this success, ALI is joining forces with various organizations to advocate for similar bans across the United States and utilizing these monumental examples as leverage in continuing European endeavors. 2022 Aquatic Life Institute (ALI) and members of the Aquatic Animal Alliance (AAA) commented on the Environmental Impact of Nueva Pescanova before the Government of the Canary Islands: General Directorate of Fisheries and the General Directorate for the Fight against Climate Change and the Environment. Allowing this industrial octopus farm to operate could result in serious biosecurity and biophysical risks from the effluents produced by this facility and discharged into surrounding waterways. There were many issues with the information provided by Nueva Pescanova regarding the environmental impacts of the proposed project, which we addressed in detail. Through the launch of Aquatic Life Institute's Octopus Farming Ban Campaign, we exposed the dangers of Nueva Pescanova's commercial octopus farm in Gran Canaria, as well as an octopus farm in Yucatan, Mexico, masquerading as a research facility (Hiding in Plain Sight). 2023 If permitted to operate, just one farm could potentially produce 1 million octopuses each year.
In an attempt to dissuade future development of this unsustainable and cruel farming endeavor, ALI pushed initiatives via our seafood certification campaign and focused on the certified marketability of this potential seafood "product" through the Aquaculture Certification Schemes Animal Welfare Benchmark. ALI expanded on our prior concerns related to impacts on animal welfare, the environment, and public health as priority points of intervention during conversations with seafood certification schemes, as a premise for prohibition. As a result, RSPCA published a statement denouncing plans for the world's first octopus farm, and Friend of the Sea provided us with a direct quotation explicitly stating they will not certify this species. If global seafood certifications refuse to create a "price premium" market for this product, perhaps this could serve as an indication to producers and investors that such products will not be welcomed or worth it. These demonstrations of opposition are a testament to our attempts at rejecting a dangerous development before it becomes an industrial disaster, and translate to the prevention of unnecessary suffering for millions of animals. Through collaborative efforts with members of the Aquatic Animal Alliance (AAA) and the Aquatic Animal Policy focus group (AAP), spearheaded by the Aquatic Life Institute, we actively advocated for HB 1153 in Washington State. Several ALI team members were present during the public hearing for the House Agriculture and Natural Resources Committee to vote on HB 1153 - Prohibiting Octopus Farming - and submitted subsequent written testimony in support. Our extensive communications with decision makers contributed to a series of successful milestones, ultimately resulting in its enactment into law. 2024 February proved to be a fast and furious month as we witnessed history being made: February 6, 2024: HB 1153 is pulled and passes the House Floor.
February 14, 2024: ALI wrote to all of Washington's Senators on the Agriculture,...
Mar 15, 2024 • 7min

AF - Improving SAE's by Sqrt()-ing L1 & Removing Lowest Activating Features by Logan Riggs Smith

This is: Improving SAE's by Sqrt()-ing L1 & Removing Lowest Activating Features, published by Logan Riggs Smith on March 15, 2024 on The AI Alignment Forum. TL;DR We achieve better SAE performance by: Removing the lowest activating features Replacing the L1(feature_activations) penalty function with L1(sqrt(feature_activations)) with 'better' meaning: we can reconstruct the original LLM activations w/ lower MSE & with fewer features/datapoint. As a sneak peek (the graph should make more sense as we build up to it, don't worry!): Now in more detail: Sparse Autoencoders (SAEs) reconstruct each datapoint in [layer 3's residual stream activations of Pythia-70M-deduped] using a certain number of features (this is the L0-norm of the hidden activation in the SAE). Typically the higher activations are interpretable & the lowest activations non-interpretable. Here is a feature that activates mostly on apostrophes (removing it also makes the model worse at predicting "s"). The lower activations are conceptually similar, but then we have a huge number of tokens that are something else. From a datapoint viewpoint, there's a similar story: given a specific datapoint, the top activating features make a lot of sense, but the lowest ones don't (ie if 20 features activate to reconstruct a specific datapoint, the top ~5 features make a decent amount of sense & the lower 15 make less and less sense). Are these low-activating features actually important for downstream performance (eg CE)? Or are they modeling noise in the underlying LLM (which is why we see conceptually similar datapoints at lower activation levels)? Ablating Lowest Features There are a few different ways to remove the "lowest" feature activations. Dataset View: Lowest k-features per datapoint Feature View: Features have different activation values.
Some are an OOM larger than others on average, so we can set feature-specific thresholds. Percentage of max activation - remove all feature activations that are < [10%] of the max activation for that feature Quantile - remove all features in the [10th] percentile of activations for each feature Global Threshold - treat all features the same: set all feature activations less than [0.1] to 0. It turns out that the simple global threshold performs the best: [Note: "CE" refers to the CE when you replace [layer 3 residual stream]'s activations with the reconstruction from the SAE. Ultimately we want the original model's CE with the smallest number of features per datapoint (L0 norm).] You can halve the L0 w/ a small (~0.08) increase in CE. Sadly, there is an increase in both MSE & CE. If MSE were higher & CE stayed the same, that would support the hypothesis that the SAE is modeling noise at lower activations (ie noise that's important for MSE/reconstruction but not for CE/downstream performance). But these lower activations are important for both MSE & CE similarly. For completeness' sake, here's a messy graph w/ all 4 methods: [Note: this was run on a different SAE than the other images] There may be more sophisticated methods that take into account feature-level information (such as whether it's an outlier feature, or feature frequency), but we'll stick w/ the global threshold for the rest of the post. Sweeping Across SAEs with Different L0's You can get wildly different L0's just by sweeping the weight on the L1 penalty term: increasing the L0 improves reconstruction, but at the cost of more, potentially polysemantic, features per datapoint. Does the above phenomenon extend to SAEs w/ different L0's? It looks like it does & the models seem to follow a pareto frontier.
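The global threshold (and, for contrast, the percentage-of-max scheme) can be sketched in a few lines of NumPy. This is a toy illustration with made-up activations; the 0.1 and 10% cutoffs are the bracketed example values from the post, and the rows-as-datapoints, columns-as-features layout is an assumption:

```python
import numpy as np

# Toy SAE feature activations: rows are datapoints, columns are features.
acts = np.array([[0.05, 0.80, 0.00],
                 [0.20, 0.02, 1.50]])

def global_threshold(acts, tau=0.1):
    """Global Threshold: zero out every activation below one shared cutoff."""
    out = acts.copy()
    out[out < tau] = 0.0
    return out

def pct_of_max_threshold(acts, pct=0.10):
    """Percentage of max: per-feature cutoff at pct of that feature's max."""
    cutoff = pct * acts.max(axis=0, keepdims=True)
    out = acts.copy()
    out[out < cutoff] = 0.0
    return out

sparse = global_threshold(acts)
# L0 (nonzero features per datapoint) drops from [2, 3] to [1, 2]
```

The L0 per datapoint is just the count of surviving nonzero activations, which is the quantity traded off against CE in the graphs described above.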
Using L1(sqrt(feature_activation)) @Lucia Quirke trained SAEs with L1(sqrt(feature_activations)) (this punishes smaller activations more & larger activations less) and anecdotally noticed fewer of these smaller, unintepreta...
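A minimal sketch of the penalty swap described here, in NumPy; the coefficient and the eps guard (which keeps the gradient finite at zero activation) are my assumptions, not details from the post:

```python
import numpy as np

def sparsity_penalty(feature_acts, l1_coeff=1e-3, use_sqrt=True, eps=1e-8):
    """Sparsity term of the SAE loss.

    Plain L1 sums |activations|; the sqrt variant sums sqrt(|activations|),
    which penalizes small activations relatively more and large ones less,
    pushing the low, hard-to-interpret activations toward zero.
    """
    a = np.abs(feature_acts)
    if use_sqrt:
        a = np.sqrt(a + eps)  # eps keeps the gradient finite at zero
    return l1_coeff * a.sum()

acts = np.array([0.04, 0.04, 1.0])
# For activations below 1, sqrt(a) > a, so the sqrt penalty is larger:
plain = sparsity_penalty(acts, l1_coeff=1.0, use_sqrt=False)
sqrt_pen = sparsity_penalty(acts, l1_coeff=1.0, use_sqrt=True)
```

In a real training loop this term would be added to the reconstruction MSE, with l1_coeff swept to trade off L0 against reconstruction as in the sweeps above.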
Mar 15, 2024 • 57sec

EA - Unflattering aspects of Effective Altruism by NunoSempere

This is: Unflattering aspects of Effective Altruism, published by NunoSempere on March 15, 2024 on The Effective Altruism Forum. I've been writing a few posts critical of EA over at my blog. They might be of interest to people here: Unflattering aspects of Effective Altruism Alternative Visions of Effective Altruism Auftragstaktik Hurdles of using forecasting as a tool for making sense of AI progress Brief thoughts on CEA's stewardship of the EA Forum Why are we not harder, better, faster, stronger? ...and there are a few smaller pieces on my blog as well. I appreciate comments and perspectives anywhere, but prefer them over at the individual posts, since I disagree with the EA Forum's approach to life. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Mar 15, 2024 • 3min

EA - The Lack of EA in US Private Foundations by Kyle Smith

This is: The Lack of EA in US Private Foundations, published by Kyle Smith on March 15, 2024 on The Effective Altruism Forum. I've written before about trying to bring US private foundations into EA as major funders. I got some helpful feedback and haven't really pursued it further. I study US private foundations as a researcher and recently conducted a qualitative data collection of staff at 20 very large US private foundations ($100m+ assets). The subject of the study isn't directly EA-related (it focused mostly on how they use accounting/effectiveness information and accountability), but it got me thinking a lot! Some interesting observations that I am going to explore further, in future forum posts (if y'all think it's interesting) and future research papers: Trust-based philanthropy (TBP), a funder movement that's only been around since 2020, has had a HUGE impact on very large private foundations. All 20 indicated that they had already integrated, or were in the process of integrating, TBP into their grantmaking. I can't emphasize enough how influential TBP has been. (This is a major finding of our current paper that is being drafted.) It was not a planned question, but I often asked if they/their foundation knew about EA and if it had influenced their giving. Some were slightly aware of EA, and primarily had a negative perception. None were convinced by EA ideas and none indicated their foundation had been influenced by EA. When pressed about their feelings about EA, many suggested that they viewed EA and TBP as incompatible with one another (either you trust your grantees [TBP] or you evaluate them rigorously [EA]), and they were choosing TBP. Which was pretty interesting to me, as I don't think they are incompatible (this is probably a paper I am going to get going here soon).
I think a place where there is a disconnect is that these private foundations basically think being EA-aligned means you have to be a major pain in the ass to your grantees. There certainly are major roadblocks to integrating EA into US private foundations. Their charters typically force them to concentrate their giving in specific cause areas or geographic areas, constraints that by design are not particularly compatible with EA. But I still believe there is potential for progress, even if it doesn't mean PF funds get to EA directly. Whatever constraints a PF charter imposes, within those constraints EA principles can still help them increase their effectiveness. How can EA make inroads with these major funders, and does it potentially start with a model for effective giving that is willing to relax otherwise-necessary principles so as to allow for constraints? Some constraints are easier to relax than others: a constraint that says "we must focus on education" is easier than "we must focus on the state of Delaware". Anyway, all of this is on my mind and I am writing this to procrastinate from actually writing the paper that this all comes from! Anyone interested in this topic, feel free to reach out. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Mar 15, 2024 • 5min

EA - Crowdsourced Overview of Funding for Regional Community Building Orgs by Rockwell

This is: Crowdsourced Overview of Funding for Regional Community Building Orgs, published by Rockwell on March 15, 2024 on The Effective Altruism Forum. This post was primarily authored by Kiryl Shantyka (EA Sweden), James Herbert (EA Netherlands), and Rocky Schwartz (EA NYC). They received initial feedback from many MEARO leaders and some initial feedback from Caleb Parikh (EA Funds) and Naomi Nederlof (CEA Community Building Grants Manager), though they don't necessarily endorse all of the points made in this post. All mistakes are the authors' own. Introduction and Request for Contribution Last month, we introduced a new term: Meta EA Regional Organisation (MEARO) and invited the broader EA community to participate in discussion on the value and evaluation of these organisations. This post focuses on funding for MEAROs. We share information on recent changes in MEARO funding and request contributions to funding data collection efforts. We think there is strong potential for a more structured MEARO funding strategy, backed by robust data. Our hope is to spark a conversation that ensures MEAROs' work is intentionally supported and their significance fully recognized. We think the EA community might want to answer the questions: What percentage of EA funding should go to meta work? What percentage of meta work funding should go to meta EA regional organisations (MEAROs)? To answer the above questions, among other considerations, we need to know: The current state of MEARO funding. How the MEARO funding landscape has changed in the past year. The results of our investigation are below. We gathered this information by speaking with funders (EA IF and CBG) and MEARO leaders. Next steps: We would like to see better data collection and tracking.
Two ways to do this are by contributing to the MEARO Funding Map and The Centre for Exploratory Altruism Research's EA Meta Funding Survey. Map of Funding Data To provide a clearer and more nuanced overview of the MEARO funding situation, we've developed an interactive map. It's important to note that our current dataset does not fully capture all MEAROs' funding details. Read about our methodology in the notes to the map. We divide existing MEAROs into the following classification categories: Stable Funding: Minor increase, adjustment, or no change in funding levels compared to the previous period that hasn't affected organisational capacity significantly. Adjusted Funding: A minor reduction in funding (reported 0-10% reduction of organisational capacity), leading to a slight decrease in operational capacity or FTEs. Reduced Funding: A noticeable reduction in funding (10-30% of organisational capacity), significantly impacting operational budgets and possibly leading to a moderate decrease in FTEs. Critical Funding Cut: A major reduction in funding (30-70% of organisational capacity), critically affecting operational budgets and leading to a significant decrease in FTEs. Drastic Funding Cut: A dramatic reduction in funding (70%+ of organisational capacity). Under Review: Organisations whose funding situation is currently being evaluated or will be reassessed in the near future, with potential for changes. For MEARO leaders, particularly those with active or adjusted full-time equivalents (FTEs), your input is invaluable. By sharing your information through this form, you'll be contributing to a more complete and accurate overview. MEARO Funding Structure The historical structure of MEARO funding makes organisations particularly vulnerable to funding cuts because they are usually reliant on one major funder and do not have an independent financial runway. Most MEAROs receive 70 to 100% of their funding from institutional funders (e.g.
Centre for Effective Altruism (CEA), EA Infrastructure Fund (EAIF)), typically on six- to twelve-mo...
Mar 15, 2024 • 8min

LW - Constructive Cauchy sequences vs. Dedekind cuts by jessicata

This is: Constructive Cauchy sequences vs. Dedekind cuts, published by jessicata on March 15, 2024 on LessWrong. In classical ZF and ZFC, there are two standard ways of defining reals: as Cauchy sequences and as Dedekind cuts. Classically, these are equivalent, but they are inequivalent constructively. This makes a difference as to which real numbers are definable in constructive logic. Cauchy sequences and Dedekind cuts in classical ZF Classically, a Cauchy sequence is a sequence of reals x_1, x_2, ..., such that for any ϵ > 0, there is a natural N such that for any m, n > N, |x_m - x_n| < ϵ. A Cauchy sequence lets us approximate the represented real to any positive degree of precision. If we want to approximate the real by a rational within ϵ, we find N corresponding to this ϵ and use x_(N+1) as the approximation. We are assured that this approximation must be within ϵ of any future x_i in the sequence; therefore, the approximation error (that is, |x_(N+1) - lim_i x_i|) will not exceed ϵ. A Dedekind cut, on the other hand, is a partition of the rationals into two sets A, B such that: A and B are non-empty. For rationals x < y, if y ∈ A, then x ∈ A (A is downward closed). For x ∈ A, there is also y ∈ A with x < y (A has no greatest element). It represents the real number sup A. As with Cauchy sequences, we can approximate this number to within some arbitrary ϵ; we do this by doing a binary search to find rationals x ∈ A and y ∉ A with y - x < ϵ. Translating a Dedekind cut to a CSR (Cauchy sequence of rationals) is straightforward. We set the terms of the sequence to be successive binary search approximations of sup A, each of which is rational. Since the binary search converges, the sequence is Cauchy. To translate a CSR to a Dedekind cut, we will want to set A to be the set of rational numbers strictly less than the sequence's limit; this is correct regardless of whether the limit is rational (check both cases).
These constitute the set of rationals y for which there exists some rational ϵ > 0 and some natural N, such that for every n > N, y + ϵ < x_n. We're not worried about this translation being computable, since we're finding a classical logic definition. Since CSRs can be translated to Dedekind cuts representing the same real number and vice versa, these formulations are equivalent. Cauchy sequences and Dedekind cuts in constructive mathematics How do we translate these definitions to constructive mathematics? I'll use an informal type theory based on the calculus of constructions for these definitions; I believe they can be translated to popular theorem provers such as Coq, Agda, and Lean. Defining naturals, integers, and rationals constructively is straightforward. Let's first consider CSRs. These can be defined as a pair of values: s : ℕ → ℚ and t : (ϵ : ℚ, ϵ > 0) → ℕ, satisfying: ∀(ϵ : ℚ, ϵ > 0), ∀(m : ℕ, m > t(ϵ)), ∀(n : ℕ, n > t(ϵ)): |s(m) - s(n)| < ϵ. Generally, type theories are computable, so s and t will be computable functions. What about Dedekind cuts? This consists of a quadruple of values: a : ℚ → B, b : ℚ, c : ℚ, d : (x : ℚ, a(x) = True) → ℚ, where B is the Boolean type. A corresponds to the set of rationals for which a is true. The quadruple must satisfy: a(b) = True; a(c) = False; ∀(x : ℚ, a(x) = True): d(x) > x ∧ a(d(x)) = True; ∀(x, y : ℚ, x < y, a(y) = True): a(x) = True. a specifies the sets A and B; b and c show that A...
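The cut-to-sequence direction described above - successive binary-search approximations of sup A - can be sketched in Python with exact rational arithmetic. The sqrt(2) cut at the end is a hypothetical example of mine, not one from the post:

```python
from fractions import Fraction

def cut_to_cauchy(a, b, c, n_terms):
    """Turn a Dedekind cut into a Cauchy sequence of rationals.

    a is the membership test for A, b a rational witness in A, c a rational
    witness not in A. Each binary-search step halves the interval [lo, hi]
    straddling sup A, so the sequence of lo values is Cauchy and converges
    to sup A from within A.
    """
    lo, hi = Fraction(b), Fraction(c)
    seq = []
    for _ in range(n_terms):
        mid = (lo + hi) / 2
        if a(mid):
            lo = mid  # mid is in A: sup A lies in [mid, hi]
        else:
            hi = mid  # mid is not in A: sup A lies in [lo, mid]
        seq.append(lo)
    return seq

# Hypothetical cut for sqrt(2): A = {x : x < 0 or x^2 < 2}
approx = cut_to_cauchy(lambda x: x < 0 or x * x < 2, 1, 2, 20)
```

Since the interval halves each step, after n steps the approximation is within (c - b) / 2^n of sup A, which is exactly why the resulting sequence is Cauchy.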
Mar 14, 2024 • 8min

LW - Conditional on Getting to Trade, Your Trade Wasn't All That Great by Ricki Heicklen

This is: Conditional on Getting to Trade, Your Trade Wasn't All That Great, published by Ricki Heicklen on March 14, 2024 on LessWrong. "I refuse to join any club that would have me as a member" -Marx[1] 1. The Subway Seat: You're on a subway platform, and the train pulls into the station. Almost all of the subway cars are full to the brim, but you notice one that is entirely empty. You'll be able to get a seat to yourself! You step onto that car. The air conditioning is broken, and someone has defecated on the floor. 2. The Juggling Contest: Your quantitative trading firm is holding its annual juggling tournament. Cost to enter is $18, winner takes all. You know you're far better at juggling than most of your coworkers, so you sign up. As it turns out, only a few of your coworkers sign up, including Alice, who used to be in the lucrative professional juggling world before leaving to pursue her lifelong passion of providing moderate liquidity to US Equities markets. You come in second place. 3. The Bedroom Allocation: You're moving into a new two-bedroom apartment with your roommate Bob. You've each had a chance to check out the apartment, and Bob asks you which bedroom you prefer. Both looked equally good to you, so you tell Bob that he can choose. Bob chooses a room, and lo and behold, you find upon moving in that your room has less closet space and worse flooring. You realize you would have been better off flipping a coin (giving you a 50% chance at the better room) instead of leaving it up to Bob. 4. The Thanksgiving Leftovers: It's the Sunday after Thanksgiving, and dinner is leftovers. You recall that your family's Thanksgiving meal was delicious, so you're excited to eat more of it.
You get to the table, and find that the only food left is Uncle Cain's soggy fruit salad - all of the yummy food has disappeared over the weekend. 5. The Wheelbarrow Auction: At the town fair, a wheelbarrow is up for auction. You think the fair price of the wheelbarrow is around $200 (with some uncertainty), so you submit a bid for $180. You find out that you won the auction - everyone else submitted bids in the range of $25-$175, so your bid is the highest. After paying and taking your new acquisition home, you discover that the wheelbarrow is less sturdy than you'd estimated, and is probably worth more like $120. You check online, and indeed it retails for $120. You would have been better off buying it online. 6. The Wheelbarrow Auction, part 2: At the town fair, a wheelbarrow is up for auction. You think the fair price of the wheelbarrow is around $120 (with some uncertainty), so you submit a bid for $108. You find out that you didn't win - the winning bidder ends up being some schmuck who bid $180. You don't exchange any money or wheelbarrows. When you get home, you check online out of curiosity, and indeed the item retails for $120. Your estimate was great, your bid was reasonable, and you exchanged nothing as a result, reaping a profit of zero dollars and zero cents. 7. The Laffy Taffys: Laffy Taffys come in four flavors, three of which you really like. Your friend Drew is across the room next to the Laffy Taffy bowl, and you ask him to throw you a Laffy Taffy. (You don't want to ask him for too big a favor, so you don't specify flavor - you figure you're 75% to get a good one anyway.) He reaches into the bowl and draws a Laffy Taffy and tosses it to you. It's banana. 8. The Field: You want to invest in real estate, so you go to your field-owning friend Ephron and submit a market order for his field. You: I would like to buy your field. Ephron: My man, it is all yours. Take it. You: No, I want to pay dollars for it. I will pay whatever it costs. 
Which is how much, by the way? Ephron: Oh, I guess, if I had to put a price on it, hrm, maybe $400 million? What's $400 million between frien...
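The two wheelbarrow auctions are instances of the winner's curse: each bidder's estimate may be unbiased on its own, but conditional on your bid winning, yours was probably the most optimistic estimate in the room. The Monte Carlo sketch below is not from the post; the bidder count, noise width, and $120 true value are illustrative numbers chosen to echo the scenarios:

```python
import random

def avg_winner_error(n_bidders=10, true_value=120.0, noise=30.0, trials=10000):
    """Average (winner's estimate - true value) over many sealed-bid auctions.

    Each bidder's estimate is true_value plus symmetric noise, so any single
    estimate is unbiased. The winner, however, is whoever drew the highest
    estimate, so the winner's error is biased upward.
    """
    total = 0.0
    for _ in range(trials):
        estimates = [true_value + random.uniform(-noise, noise)
                     for _ in range(n_bidders)]
        total += max(estimates) - true_value
    return total / trials

random.seed(0)
print(f"solo bidder bias:       {avg_winner_error(n_bidders=1):+.2f}")
print(f"ten-bidder winner bias: {avg_winner_error(n_bidders=10):+.2f}")
```

With one bidder the bias is roughly zero; with ten, the winner's estimate runs about $25 high on average (the expected maximum of ten symmetric draws), which mirrors the gap between the optimistic $200 estimate and the $120 retail price in scenario 5.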
Mar 14, 2024 • 18min

AF - More people getting into AI safety should do a PhD by AdamGleave

This is: More people getting into AI safety should do a PhD, published by AdamGleave on March 14, 2024 on The AI Alignment Forum.

Doing a PhD is a strong option for getting great at developing and evaluating research ideas. These skills are necessary to become an AI safety research lead, one of the key talent bottlenecks in AI safety, and are helpful in a variety of other roles. By contrast, my impression is that currently many individuals with the goal of being a research lead pursue options like independent research or engineering-focused positions instead of doing a PhD. This post details the reasons I believe these alternatives are usually much worse at training people to be research leads.

I think many early-career researchers in AI safety are undervaluing PhDs. Anecdotally, I think it's noteworthy that people in the AI safety community were often surprised to find out I was doing a PhD, and positively shocked when I told them I was having a great experience. In addition, I expect many of the negatives attributed to PhDs are really negatives of any pathway involving the open-ended, exploratory research that is key to growing into a research lead.

I am not arguing that most people contributing to AI safety should do PhDs. In fact, a PhD is not the best preparation for the majority of roles. If you want to become a really strong empirical research contributor, then start working as a research engineer on a great team: you will learn how to execute and implement faster than in a PhD. There are also a variety of key roles in communications, project management, field building and operations where a PhD is of limited use.

But I believe a PhD is excellent preparation for becoming a research lead with your own distinctive research direction that you can clearly communicate and ultimately supervise junior researchers to work on. However, career paths are highly individual and involve myriad trade-offs. Doing a PhD may or may not be the right path for any individual person: I simply think it has a better track record than most alternatives, and so should be the default for most people. In the post I'll also consider counter-arguments to a PhD, as well as reasons why particular people might be better fits for alternative options. I also discuss how to make the most of a PhD if you do decide to pursue this route.

Author Contributions: This post primarily reflects the opinion of Adam Gleave, so it is written using an "I" personal pronoun. Alejandro Ortega and Sean McGowan made substantial contributions writing the initial draft of the post based on informal conversations with Adam. The resulting draft was then lightly edited by Adam, incorporating feedback and suggestions from Euan McLean and Siao Si Looi.

Why be a research lead? AI safety progress can be substantially accelerated by people who can develop and evaluate new ideas, and mentor new people to develop this skill. Other skills are also in high demand, such as entrepreneurial ability, people management and ML engineering. But being one of the few researchers who can develop a compelling new agenda is one of the best roles to fill. This ability also pairs well with other skills: for example, someone with a distinct agenda who is also entrepreneurial would be well placed to start a new organisation.

Inspired by Rohin Shah's terminology, I will call this kind of person a research lead: someone who generates (and filters) research ideas and determines how to respond to results. Research leads are expected to propose and lead research projects. They need strong knowledge of AI alignment and ML. They also need to be at least competent at executing on research projects: for empirically focused projects, this means adequate programming and ML engineering ability, whereas a theory lead would need stronger mathematical ability. However, what real...
