I am all for making AI better and more aligned with things which are good for people. I don’t really like the term AI Safety though because it seems to encompass a lot of different things which need their own proper attention.

Stuff like:

  1. Bias in models and bad outcomes that result (e.g. Weapons of Math Destruction)
  2. Ability of models to give people dangerous info (how to make a bomb or whatever)
  3. Concerns around privacy
  4. Loss of Jobs and/or increasing inequality
  5. Concerns that an All Powerful AGI will attack/destroy/enslave humanity

These are all important but they range in practicality from Already Happening to Sci Fi Fantasy.

We do need to be thinking about this stuff. Items 1-3 feel like they can be addressed by developing the right rules and best practices, but that will only happen if people make the effort and understand the issues.

Number 4 is a bit harder to think about. I recently read Power and Progress, which has some good ideas here, but the voices of the tech utopians are pretty loud and don’t require anyone to actually do anything besides continue to worship them. This is a long-term political project which can’t just be solved with better engineering.

As for number 5, I am not saying it is impossible, and I enjoyed Mrs Davis, but I am a lot more worried about some human actor using AI against other people than about Super-aligning a future AI so it doesn’t decide to go against us. I am skeptical of how well this could even work, in part because it seems like a hard problem but also because I don’t trust the people who are working on it. A bunch of tech lords sitting around deciding what a pro-social alignment would be for the god they are building doesn’t seem like it’s going to work out well.

I don’t really have a suggestion for better terms. I am just sad that when I read something about AI Safety concerns, it takes a lot of effort to know whether we are talking about DAIR or LessWrong.






