Adventures with Deep Research

How my AI report on housework started well, then went off the rails

I’ve long been interested in the topic of housework, as you can see from this Crooked Timber post, which produced a long and unusually productive discussion thread [fn1]. The issue came up again in relation to the prospects for humanoid robots. It’s also at the edge of a bunch of debates going on (mostly on Substack) about living standards and birth rates.

I’m also interested (like nearly everyone, one way or another) in “Artificial Intelligence” (scare quotes intentional). My current position is, broadly, that it’s what Google should have become instead of being steadily enshittified in the pursuit of advertising dollars. But I’m alert to other possibilities, including that more investment will deliver something that genuinely justifies the name AI. And I think a lot of the concerns about power and water use, the spread of AI slop and so on are either overstated or (as with deepfakes) are mostly new iterations of concerns that always arise with new IT and communications technology, and can be addressed with existing conceptual and legal tools.

With this background, I thought it would be interesting to try out ChatGPT’s Deep Research (DR) on the question of what has happened to housework and why. As I may have mentioned before, I’ve trained DR on a big volume of my own writing. That produces a passable imitation of my style, and means I don’t worry about the ethical issues of plagiarising the writing style of others (of course, standard norms of citation and attribution still apply).

I decided to focus on single-person households, to abstract away from the issues of child-raising (which I want to look at separately) and the allocation of work between partners (about which there is a vast literature to which I can’t add anything new).

Everything went really well to start with. I prompted DR for time use data, then pushed further on with more detailed questions like the impact of air fryers on male cooking habits (I was given one recently and was impressed enough that I promptly bought a second). I asked for a literature search and got references to Judy Wajcman and Michael Bittman, both of whom I knew, and a couple of people I didn’t. DR missed Ruth Schwartz Cowan’s classic More Work for Mother.

On the other hand, I wasn’t aware of Wajcman’s recent Pressed for Time and hadn’t thought about the broader issue of life administration, which DR pointed out. I gave it a more economistic take, trying to divide labour-saving innovation (electronic bill paying) from the labour costs of more digital consumption (retrieving passwords for streaming services etc).

I got DR to produce a LaTeX file, and was nearly ready to go to digital press when I noticed that the references were incomplete. At this stage, the whole process spiralled into disaster. Every draft seemed to lose more material, and to be worse written. Finally, I demanded an explanation.

Source: Enkiai.com

DR graciously explained that this is the way it does things. Here’s its response, with original formatting (more detailed version here).


Why Deep Research Fails

The core issue: The “illusion of continuity” in long-context LLMs

Large-context models appear to preserve every part of the conversation, but in reality:

They do not reliably retain or retrieve long-form outputs they themselves generated.

Even though the conversation window may be large, the model often cannot:

  • accurately locate the version you consider the “full text,”
  • reliably differentiate between multiple drafts (short, medium, long) created earlier,
  • or accurately regenerate from earlier output unless it is verbatim re-supplied by the user.

This is because the internal contextual representation compresses earlier tokens. Information is not stored sequentially; it is represented in a latent space, which cannot always reconstruct past outputs faithfully.


Fortunately, my LaTeX program took continuous backups, so I’ve been able to retrieve the last good version, but I’m going to keep it away from DR for now.

I was going to go on with more detail about the actual report, but my op-ed training leads me to feel that a post should have 700 words, and I am at 675 as I type this. 

fn1. I can take a victory lap on my jihad/crusade against ironing, which has disappeared almost entirely, contradicting the expectations of many commenters.

Will fewer kids mean fewer scientists?

I’ve been seeing more and more alarmism about the idea that, on current demographic trends, the world’s population might shrink to a billion in a century or two. That distant prospect is producing lots of advocacy for policies to increase birth rates right now.

One of the big claims is that a smaller population will reduce the rate of scientific progress. I’ve criticised this in the past, pointing out that billions of young people today, particularly girls, don’t get the education they need to have any serious chance of realising their potential. But it seems as if I need to repeat myself, so I will do so, trying a slightly different tack.

It’s surprisingly difficult to get an estimate of the number of researchers in the world, but Google Scholar gives us a rough idea. Google Scholar indexes research across all academic disciplines, including social sciences and humanities. No exact count is available, but I’ve seen an estimate that 1.5 million people have Google Scholar profiles. I’d guess that this would account for at least half of all active researchers, for a total of 3 million.

Image: www.isaaa.org, licensed under BY-NC-ND


Musk’s last grift

The US is one big grift these days: the Trump Administration, traditional and social media, corporations, crypto, financial markets are all selling some kind of spurious promise. It’s hard to pick the most egregious example. But for me, it’s hard to go past Tesla. Having lost its dominant position in the electric car market, the company ought to be on the edge of delisting. Instead, its current market capitalisation is $US1.33 trillion ($A2 trillion). Shareholders have just agreed on an incentive deal with Elon Musk, premised on the claim that he can take that number to $8.5 trillion.

Having failed with the Cybertruck and robotaxis, Tesla’s value depends almost entirely on the projected success of the Optimus humanoid robot. There’s a strong case that Optimus will be outperformed by rivals like Unitree. But the bigger question is: why build a humanoid robot at all?


Armistice Day

107 years ago*, the guns fell silent on the Western Front, marking a temporary and partial end to the Great War which began in 1914, and has continued, in one form or another, ever since. I once hoped that I would live to see a peaceful world, but that hope has faded away.

* Fixed my arithmetic error, noted by several readers – I seem to be getting worse at this. Also, the date is 11/11 in Australia, where I’m writing.

What are “rusted-on” Labor voters thinking?

And will they stay loyal?

When I first started following politics, in the dying days of the McMahon LNP government, most people voted consistently for one or other of the two major parties, Labor and Liberal/Country. The only important exception, the DLP, was just about to disappear. At that time, the common view was that consistent party voters were acting out of habit or class/cultural identity, while “swinging” voters made a considered choice, based on policies, candidates and so on. Since both major parties got close to 50 per cent most of the time, appealing to these swinging voters was seen as the crucial task.

Once political scientists started looking, however, they found out that swinging voters were typically the least engaged group, paying little attention to politics or policies and making late choices based on frivolous issues or just “vibes”. The lesson drawn by party “hardheads” (a term I routinely deride) was that the best strategy was to offer attractive goodies that would tempt these voters. A better response, though, was to convert consistent voters from the other side. Labor managed more or less the opposite of that with the 1956 split, losing supporters to the DLP who mostly ended up voting Liberal when the hope of (re)gaining control of the ALP was abandoned.

That was then. Now, with the combined vote of the majors down to 60 per cent in the latest Newspoll, the dissident 40 per cent is very different from the swinging voters of yesteryear. But what about the “rusted-ons”? It looks as though the LNP has just about lost its core base, going either to One Nation and other far-right parties or to centrist independents. But Labor still gets the support of more than 30 per cent of voters nearly all the time (the low point, I think, was 25.5 per cent for the utter trainwreck of NSW Labor in 2011).

What are they thinking about and will they keep thinking the same way? Thinking about why someone might choose to vote Labor consistently, there are a few possibilities.

First, there are voters for whom Labor’s policies have been and remain closer to their own preferences than any other option (LNP, Greens, independents). Given Labor’s shift to the right on most issues, that would include voters who were always on Labor’s right flank. Another group would be voters focused on the relatively few issues, such as union rights, where Labor has sustained a relatively strong position, along with a historical record. But it’s hard to see this group as being very large.

Next there are those for whom voting Labor is a matter of personal/cultural identity, similar to cheering for (to pick an example not exactly at random) South Sydney in the NRL. No one backs the Rabbitohs because they approve of the current coach’s preferred style of play, and very few because they like particular players. It’s just something you are born with, or pick up along the way. To the extent that this is a view formed by growing up in a traditionally working-class environment where voting Labor is taken for granted, it’s eroding over time.

Third are those who don’t consider any alternative to voting for a major party. In large measure this reflects misunderstandings about the nature of preferential voting, misconceptions shared by a surprising number of political commentators and very much encouraged by Labor. In 2022, for example, Labor ran with the (totally false) claim “voting 1 for Labor is the only way to get rid of Morrison”.

A variation, relevant with a large crossbench, is a belief that minority governments are problematic and unstable. This remains the default position of the commentariat, who have built their careers in a two-party system, and don’t know how to handle anything else. But as minority governments become more common, and function fairly effectively, this belief can’t be sustained indefinitely.

As I’ve argued in previous posts, Labor’s positioning on the centre-right makes it the natural party of government as long as the main opposition comes from a fragmented right and centre-right. In the short to medium term (say the next 5-10 years), this is a “hardhead” strategy. But it has already alienated most people who actually care about positive policy outcomes, and is making Labor identity more and more difficult to sustain. Sooner or later, Labor’s base will contract to a point where it can no longer sustain the idea of a binary choice. At that point, the party will have little left to offer to anyone.