I have argued for awhile now that the probabilistic nature of LLMs can represent a form of context that, when applied, can be utilized for robotic applications.

Seems I’m not the only one that had this idea. While simple, Microsoft researched applied a high level control library to demonstrate LLMs (ChatGPT) developing robotic task code.

    • @hlfshellOP
      link
      111 months ago

      This isn’t mine, it’s just an interesting blogpost I came across. Nor am I arguing that it should replace a robotics engineer.

      My main thought, not fully represented in the post, is that LLMs can act as a context engine for high level understanding of instructions + spatial awareness, and then apply it to actuation. This is somewhat touched upon in the article.

      I do think that there is some interesting work in LLM powered task level planning. I’m hoping to find the time put together a good example of this, utilizing the ability for LLMs to make logical leaps based on instruction. In the article, it took the command “I’m thirsty” to mean move to a drink. In a more applicable application, we can use LLMs to identify that a room with multiple identified objects (refrigerator, oven, stove, cabinets, etc) is in fact a kitchen. Then, from there, determine that “I’ve seen a room I’ve identified as a kitchen - I can navigate there to attempt to find a drink”.