Lessons Learned From The DAIN Experience

It has been over 4 years since I touched DAIN/DIANA, so this article is long overdue, but for the past couple of months I have been digging back through the experimental code. It was very much spaghetti, since it was experimental and I was tweaking it while working on many other things. I did a quick tally of just the .cs files in the main solution (not counting any of the included libraries, whether third party or ones I created over the years) and came up with between 7 and 8 MB of C# source code alone.

There are lots of lessons learned, and it will take a while to go through and pull out all the useful code and lessons from the research, but here are some obvious ones for people looking to build out their AI datasets. These four things helped a lot to keep my dataset down:

  1. Expand all contractions. If you do this on both the statement you are processing and anything you store, you reduce the number of synonyms you need by a lot.
  2. Always compare singular to singular unless the plural is absolutely necessary for context.
  3. Define all the ways to say Yes, No and Maybe. There are hundreds of ways to say Yes, so if you build those three words out properly it will save you a lot of time. In most systems people are looking for an answer, and often it is one of those 3.
  4. Domain and context are key. From the very beginning I made sure that everything I stored had a domain and context if possible. Simple stuff will not have context, but if you can take the user statement and quickly apply a domain and context to it, you vastly reduce the amount of your dataset that could contain the answer. The more context you apply, the smaller that dataset.
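
As a rough illustration of the first three points, here is a minimal Python sketch. It is not the DAIN/DIANA code (which is C#). The lookup tables are tiny made-up samples, the function names (`expand_contractions`, `singularize`, `normalize`, `classify_answer`) are hypothetical, and the singular rule is deliberately crude. The idea it tries to show is only that the same normalization runs on both the stored text and the incoming statement, so the two sides can be compared directly.

```python
import re

# Hypothetical sample tables; a real dataset would be far larger.
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "don't": "do not",
    "isn't": "is not",
    "i'm": "i am",
    "it's": "it is",
    "you're": "you are",
}

ANSWERS = {
    "yes": {"yes", "yeah", "yep", "sure", "absolutely", "of course"},
    "no": {"no", "nope", "nah", "no way", "not at all"},
    "maybe": {"maybe", "perhaps", "possibly", "i am not sure"},
}

IRREGULAR_PLURALS = {"children": "child", "people": "person", "mice": "mouse"}


def tokenize(text):
    """Lowercase the text and keep only word-like tokens (punctuation is dropped)."""
    return re.findall(r"[a-z0-9']+", text.lower().replace("\u2019", "'"))


def expand_contractions(text):
    """Replace each known contraction with its expanded form."""
    return " ".join(CONTRACTIONS.get(tok, tok) for tok in tokenize(text))


def singularize(word):
    """Very naive plural-to-singular rule; an assumption, not a full stemmer."""
    if word in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[word]
    if len(word) <= 3:
        return word
    if word.endswith("ies"):
        return word[:-3] + "y"
    if word.endswith(("sses", "xes", "ches", "shes")):
        return word[:-2]
    if word.endswith("s") and not word.endswith(("ss", "us", "is")):
        return word[:-1]
    return word


def normalize(text):
    """Apply the same clean-up to stored text and to user input."""
    expanded = expand_contractions(text)
    return " ".join(singularize(word) for word in expanded.split())


def classify_answer(text):
    """Map a short reply onto 'yes', 'no' or 'maybe'; None if unrecognized."""
    phrase = expand_contractions(text)
    for label, phrases in ANSWERS.items():
        if phrase in phrases:
            return label
    return None
```

With these sample tables, `normalize("The dogs can't swim")` and `normalize("the dog cannot swim")` produce the same string, and `classify_answer("I'm not sure")` lands on "maybe" only because the contraction was expanded before the lookup, which is the saving point 1 describes.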

Remember the W5 (who, what, when, where, why), as your grade 4 teacher probably told you. If you take each entity and apply as much of the W5 to it as context, then as you start to relate things to it you have already built a critical amount of knowledge around each one. The user requests you receive are going to be one of the W5 or How.
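
To make the domain, context and W5 idea a bit more concrete, here is a small Python sketch under my own assumptions. The `Fact` class, the `candidates`, `question_slot` and `lookup` helpers, and the sample facts are all hypothetical and are not taken from DAIN/DIANA. It only shows how tagging each stored item with a domain, a context and some W5 answers lets you shrink the search space before looking for an answer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

W5_AND_HOW = ("who", "what", "when", "where", "why", "how")


@dataclass
class Fact:
    """A stored item tagged with an optional domain, context and W5 answers."""
    text: str
    domain: Optional[str] = None
    context: Optional[str] = None
    w5: Dict[str, str] = field(default_factory=dict)


# Made-up sample data, for illustration only.
SAMPLE_FACTS = [
    Fact("Docker packages applications into containers",
         domain="devops", context="docker",
         w5={"what": "a container platform",
             "why": "to ship software consistently"}),
    Fact("C# is a programming language",
         domain="programming", context="csharp",
         w5={"what": "a programming language"}),
    Fact("Java is a programming language",
         domain="programming", context="java",
         w5={"what": "a programming language"}),
]


def candidates(facts: List[Fact], domain: Optional[str] = None,
               context: Optional[str] = None) -> List[Fact]:
    """Keep only the facts that match every filter that was supplied."""
    result = []
    for fact in facts:
        if domain is not None and fact.domain != domain:
            continue
        if context is not None and fact.context != context:
            continue
        result.append(fact)
    return result


def question_slot(question: str) -> Optional[str]:
    """Return the W5/How word the question starts with, if any."""
    words = question.lower().split()
    if words and words[0] in W5_AND_HOW:
        return words[0]
    return None


def lookup(fact: Fact, question: str) -> Optional[str]:
    """Answer a question from the fact's W5 entries, if that slot is filled."""
    slot = question_slot(question)
    return fact.w5.get(slot) if slot else None
```

In this sample, filtering on `domain="programming"` leaves two of the three facts, and adding `context="java"` leaves one, so each extra piece of context narrows what has to be searched.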

Hopefully this helps.

If you are going down the road of building your own AI experiments from the ground up, rather than using someone else's, please reach out. Maybe we can share ideas.

My current journey is very much in the domain I know. I am working on the Digital Developer. This would not be a replacement for traditional development but more of an assistant. It would be a team of developers with different personas based on their skillset. So if you talk to a Java digital developer, they will always want to code the solution in Java and tell you everything else sucks. Same for Python or C#. There are so many things happening in technology that it is difficult for one person to know it all, but with composable that is how it is going. Hopefully the Digital Developer will learn it for you, and then you can have those experts at hand to ask the tough questions 24/7.

If you are interested in working on that, definitely let me know and maybe we can see if we could work together on it.

Right now I have the following personas:

  1. Dee Dee is the full stack. She has no domain or context, so it will take her a while, but she will answer in a broader way.
  2. Dee Dee C. Sharp. Well, she knows C#.
  3. Dee Dee Cappuccino. You guessed it, Java.
  4. Dee Dee Pascal. Yup, Pascal.
  5. Dee Dee Powers. Yes, PowerShell, but it could have been PowerBuilder.
  6. Dee Dee Fox. FoxPro, but I am not sure anyone is still using it.
  7. Dee Dee Python.
  8. Jan Crowe. This is an ode to the person I know who knows more about databases than anyone else I know.
  9. Shelley The Sailor. This is the Docker expert, and yes, she is named after that famous Docker expert.

There are more to come, but this is my starting point. Right now they are just learning to read code and then produce documentation on what they learn, but as they grow they will start being more productive, especially as I start to pull out more from DAIN/DIANA.
