IoT .. from sensors to back-end database/application .. and a lot more
Mike McKean
Business Development & Project Manager. Technical Adviser to Dstl. Former Chair IoT Committee at IET. Trained in Strategic Sales & Communication. Projects from Stage Gate through to Production. Army Royal Signals Veteran
Internet of Things (IoT) in Oil & Gas Production … from sensors to communications, through to back-end databases, using concrete for temperature management, anti-hacking solutions … and a lot more.
The environment
Heat
Many systems are certified to work at normal room-type temperatures, others at -40 C to +70 C. As you narrow your required operating temperature range, the choice of available products reduces dramatically. So, it can be tempting to use air-conditioned cabinets/rooms?
Mounting your system in an air-conditioned equipment cabinet just added several more links to the “break/risk chain”. What happens if the power to your air-conditioning fails, or the fan fails, or the unit fills with sand or dust? You have added several weaknesses to your design, and increased the number of regular maintenance visits needed.
It is better to design your solution so that the heat dissipated is calculated for worst-case values and dissipates naturally within your overall design. No moving parts.
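To make that concrete, here is a rough back-of-envelope sketch of whether a sealed cabinet can shed its worst-case heat passively. All the figures below are illustrative assumptions, not vendor data:

# Rough worst-case passive-cooling check for a sealed cabinet.
# Every figure here is an illustrative assumption, not vendor data.

P_WORST = 60.0         # W, worst-case heat dissipated by all equipment inside
H = 5.0                # W/(m^2*K), assumed natural-convection coefficient for a metal wall
AREA = 1.5             # m^2, effective external surface area of the cabinet
T_AMBIENT = 50.0       # deg C, hottest expected ambient
T_MAX_INTERNAL = 70.0  # deg C, lowest rated limit of any component inside

# Steady state: internal temperature rise = power / (h * area)
rise = P_WORST / (H * AREA)
print(f"Internal rise over ambient: {rise:.1f} C")
print(f"Estimated internal temperature: {T_AMBIENT + rise:.1f} C")

if T_AMBIENT + rise > T_MAX_INTERNAL:
    print("FAIL: add surface area or reduce dissipation - do not add a fan")
else:
    print("OK: cabinet can dissipate worst-case heat passively")

If the check fails, the passive answer is more surface area or less dissipation, not a fan.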
Cold
Most robust systems will operate between -40 C and +70 C. Again, you can use concrete to smooth out temperature changes. However, the heat generated is still a problem.
Be aware that wires, circuit boards and everything else can crack when temperatures change quickly. So don’t leave equipment cabinet racks open, thinking that any heat generated will warm up the room.
Instead, treat the cabinets exactly the same as those in a hot environment. Keep them sealed. Then if there is an equipment failure, the heat already inside the cabinet will slow down the change in temperature, thereby slowing any physical contraction of the cables, wires or circuit boards.
Concrete for temperature management
If you install your computer/networking systems in a larger metal cabinet, you can further smooth out the heat of the day and the cold of the night by adding a block of concrete. Honest – it’s true.
No moving parts. It’s also a good weight to hold a metal cabinet solidly in place.
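If you want to put a rough number on it, here is an illustrative sketch. The mass, specific heat and heat-leak figures are assumptions for the example, not measurements:

# Illustrative thermal-mass estimate for a concrete block in a sealed cabinet.
# Figures are assumptions for the sketch, not measured values.

MASS = 100.0        # kg of concrete
C_CONCRETE = 880.0  # J/(kg*K), typical specific heat of concrete
HEAT_LEAK = 10.0    # W, assumed average heat flow into the cabinet by day

# Energy the block absorbs for every 1 C of its own temperature rise:
energy_per_degree = MASS * C_CONCRETE  # J/K

# How long can the block soak up that heat leak per degree of rise?
seconds_per_degree = energy_per_degree / HEAT_LEAK
print(f"{seconds_per_degree / 3600:.1f} hours of a 10 W leak per 1 C of rise")

Roughly 2.4 hours per degree, with those assumed numbers – enough to flatten a day/night swing considerably.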
Failures through Temperature Changes
Mobile SIM Cards (heat/cold)
Did you know that special, thicker SIM cards (with a thicker I.C.) should be used in machines (computers/gateways/IoT devices) that will be operating in very hot or cold areas? They are designed specifically to cope with heat/cold cycling and to help prevent the SIM I.C. from cracking.
Solder Joint Failures
Once a board has passed testing, it is not guaranteed perfect forever. Solder joints and PCBs get stressed, and fail. (Let me explain.)
When the environmental temperature changes, the solder joints and the PCB they connect parts to change shape at different rates. The faster you move between -40 C and +70 C, the more severe the differences are. Those solder joints will develop “micro cracks”.
Those “micro cracks” mean that the voltage supplied to the components changes very rapidly, at very high frequencies, depending on the size and shape of the cracks.
Try explaining those faults to someone. Worse still, tracking DC voltage levels with a multimeter will not show up those micro cracks.
To minimise such failures, slow down any temperature changes. Going into an equipment room where everything is freezing cold, starting things up every morning and switching them off every night, is a classic way of causing damage.
It may not be practical to leave things running 24 hours a day, but at least do something to slowly warm up the cabinet or switch room before taking components from -40 C to +70 C while vibrating them at the same time.
Something as simple as switching on an inexpensive warm-air fan in an equipment room an hour before use will help. But please don’t go from minus a lot to very hot in a couple of seconds. Yes, you might get away with it (for a while), but why do it?
Electricity Generation
Sub Station
I understand electricity sub-station system communication, i.e. the various communication protocols used; IEC 60870-5-101/104, DNP3, Modbus, and equipment requirements such as IEC 61850-3.
I also understand the value of fibre versus traditional cabling, and the electromagnetic interference from the power cables.
Having said I understand the above, my practical knowledge of electricity generation stops at helping to re-fuel, or switching over from one generator to another. Yes, I can do some calculations about AC and DC power etc., but I don’t know much about this area.
Diesel/Petrol Generators
What’s best to use in an Oil/Gas environment? Is it diesel, because it’s less combustible than petrol, or is petrol easier to source in remote areas? What type of generators does your site have now, and is it worthwhile standardising on one type to make maintenance easier? Or should you have completely different systems, i.e. diesel and petrol, just in case one or the other is easier to source?
Solar Panels
It is very sensible to have solar panels for electricity generation, especially if you are only using them for IoT devices. You can likely supply power for several devices from one small solar panel, and use a local battery for resilience and for power during the night.
Piezo Electric
Power up your IoT devices using the vibration from your drilling rig. See my video above.
Computing & Networking
I am going to split this up, starting at the sensors, through the gateways, then cloud, dashboards and applications. I’ll include SCADA, as a general topic at the beginning of this.
SCADA
Mike’s explanation of SCADA;
SCADA is the connection between a centralised controlling application and the electronic devices that make up an operational environment. This central application should be able to read data from these devices and also write instructions back to them. As much as is possible, read and write operations should be done in real time.
Note: Modbus is a master/slave protocol, so it is definitely not “real-time”, as it relies on the master scheduling the polling. IEC 60870-5 is the same, unless wired up in full duplex. Despite these two issues, Modbus and IEC 60870-5 are used throughout SCADA networks. Having made these two points, I like and use both Modbus and IEC 60870-5-101/104 (in full duplex).
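To illustrate the polling point, here is a minimal sketch of a Modbus master poll loop. It assumes the pymodbus library (3.x; the keyword names vary slightly between versions) and a hypothetical device address:

# Minimal sketch of master-scheduled Modbus polling (assumes pymodbus 3.x;
# the slave/device keyword name differs between pymodbus versions).
# The point: the master decides when data is read, so it is not real time.

import time
from pymodbus.client import ModbusTcpClient

client = ModbusTcpClient("192.0.2.10", port=502)  # hypothetical PLC address
client.connect()

while True:
    # The slave only answers when asked - data is as old as the poll interval.
    rr = client.read_holding_registers(address=0, count=2, slave=1)
    if not rr.isError():
        print("registers:", rr.registers)
    time.sleep(1.0)  # a 1 s poll interval means up to 1 s of staleness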
Please don’t be daunted when you walk into a SCADA environment. It is only daunting because the operators started building it twenty or thirty years ago, and it is likely to be critical to the company’s operations. Just learn what the various parts do, and it will end up being fairly straightforward.
One major thing to consider with SCADA systems is “cyber security”. There are three websites I use, which will be well known by security people in your company. If you are not aware of them, have a quick look;
I was shocked when I looked at some PLCs and sensors from major manufacturers. The sort of things you should look for are the underlying operating system(s), e.g. are they Windows XP, Vista, Server 2003, etc. (don’t laugh – I’m serious), and also any JVMs etc. often found in simpler sensors. Check they are not deprecated or end of life.
Then try searching for engineers’ dev kits or maintenance kits. You will be shocked to find “helpful” software kits you can freely download (from the manufacturers’ websites) to help engineers “see” things that the normal user cannot “see”. That is especially a problem with wireless, Bluetooth etc. See this video I made on locking down Bluetooth systems:
https://vimeo.com/763705921
In any large, important and complex environment there will be different manufacturers’ equipment, with different hardware, software and firmware, and who knows if it is all locked down properly?
If I was running a SCADA environment, I’d do a site survey and find out exactly what I have in terms of manufacturers’ systems, then deploy agreed builds. That may not be perfect, but at least I’d know what I had. The next thing would be to try to compartmentalise the network, to isolate any breach. That is a real problem, as it goes against the whole concept of SCADA.
Or here are my thoughts about securing SCADA;
Have an agreed or accepted list of what inputs a sensor/machine can send to anything else. Sounds simple? Bear with me.
All inputs would have their data sanitised. Why would you trust that the integrity of a single datapoint is 100% correct? Use a gateway with a local program to cross-check the input of one sensor against several others. Here’s a simple example. Imagine five sensors on a train. One measures the engine revolutions per second. Another derives the train speed from GPS signals. A third measures the train’s vibration and compares that vibration signature against known signatures. If all three suggest the train is travelling at 100 km/h, why would you open the door just because another sensor says it is safe to do so?
Then do the same, but in reverse, for a central application writing a command to a device. Maybe it is not a good idea to open a sluice gate if that means you flood the valley. Apply the same data integrity checks, to make sure no-one has hacked the system.
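Here is a minimal sketch of that cross-check idea. The sensor names, calibration factor and thresholds are all made up for illustration:

# Sketch of the cross-check above: never act on one sensor alone.
# Sensor names, calibration and thresholds are made up for illustration.

def speed_from_revs(revs_per_sec):
    return revs_per_sec * 0.9  # assumed calibration factor, km/h per rev/s

def door_may_open(revs_per_sec, gps_speed_kmh, vibration_speed_kmh, door_sensor_safe):
    estimates = [speed_from_revs(revs_per_sec), gps_speed_kmh, vibration_speed_kmh]
    # Cross-check: all independent estimates must agree the train is stopped.
    if any(est > 1.0 for est in estimates):  # 1 km/h tolerance
        return False  # three sensors say we are moving - ignore the door sensor
    return door_sensor_safe

# A single compromised or faulty sensor cannot open the door on its own:
print(door_may_open(110, 100.0, 101.5, True))  # -> False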
So you now have some basic, simple controls in place. But the hacker is better than your average engineer. He does this for big money and/or big thrills! If the hacker succeeds just once, you are in a bad place.
Now you need to review why and what you measure, and what you write to which devices.
Start with a new “read only” overlay. Something that is designed such that it is physically impossible to write commands to sensors, but that can read data. Keep everything separate from what you have now, so it does not interfere.
Now you have a second “read only view”. Importantly it’s a new, and completely unbiased view. This new view will tell you a lot.
This is the basis of your new SCADA system – over to you.
Sensors
Sensors connect using wires or radio, over various serial protocols, and often a few bits or bytes are sufficient to carry the necessary data.
Sensors could be measuring flow, pressure, temperature, humidity, density, weight, in fact just about anything.
Is a sensor difficult to connect to? Could you make your own?
Gateways
Gateways can sometimes include some type of sensor, but they are usually a concentrator for connecting multiple sensors, or a way of bridging or integrating different vendors’ networks or systems together.
They will usually support multiple different types of connectivity and different protocols. The more powerful gateways will include a local programming environment, to support “edge or fog computing”.
The gateways tend to be the point at which cloud companies or system integrators move you to their preferred cloud. In many situations, trying to move to an alternative cloud at a later stage is high risk, expensive, or impractical.
Always approach the choice of gateway with great care to avoid proprietary lock-in.
Cloud
I have a bias toward Amazon Web Services (AWS). That is because I understand it, I have some AWS certifications and I use it quite a lot.
I have worked with customers who had small on-premises servers and workstations, others who had a mix of physical and virtual systems, and others who were mostly resident in a cloud. I think the most important part of your cloud is your data. After all, you can always re-build servers, but without your data you have a real problem.
So whatever you choose for your cloud, find a way to secure access to and the ongoing integrity of your data.
Dashboards
A dashboard is different from an application, in as much as a dashboard usually just displays information.
Applications
This is the part that obviously adds the value. But the hardware and software engineers need to work closely together – or, better still, you need a Project Manager who understands both. Then you will find you can realise far more value from your system, end to end, than you expected.
You should also consider having an application installed locally on the IoT gateway, as well as your back-end. This is especially important for Health & Safety, and wherever any outage/issue might have a serious impact.
For instance, if you are managing a system that relies on a back-end (cloud-based) application, what happens if your communications link fails? This is where an application installed on the IoT gateway can take over and make sure your environment continues to operate as required.
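A minimal sketch of that gateway fallback logic follows. The function names, the pressure rule and the health check are illustrative stand-ins:

# Sketch of a gateway-side fallback loop (names and rules are illustrative).
# Safety logic always runs locally; the cloud gets the data when it can.

import random
import time

def cloud_reachable():
    # Stand-in for a real health check (MQTT keep-alive, HTTP ping, etc.)
    return random.random() > 0.2

def read_local_sensors():
    return {"pressure_bar": random.uniform(1.0, 5.0)}

def apply_local_policy(reading):
    # A minimal safety rule the gateway can enforce with no back-end at all.
    if reading["pressure_bar"] > 4.5:
        print("LOCAL OVERRIDE: venting pressure")

def forward_to_cloud(reading):
    print("forwarded:", reading)

while True:
    reading = read_local_sensors()
    apply_local_policy(reading)    # runs whether or not the link is up
    if cloud_reachable():
        forward_to_cloud(reading)  # back-end catches up when the link returns
    time.sleep(1.0)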
Communications Protocols
Let’s look at other essential industries. Electricity & Water use DNP3, IEC 60870-5-101/104 and Modbus. Oil/Gas industries use HART, Profibus, Foundation Fieldbus/IEC 61158 and Modbus.
Modbus revolutionised the industrial control environment. In 1979 Modicon offered it free to use. Compare that with the computer world, where Linux only generally replaced Solaris, HP-UX and AIX in the early 2000s. So, arguably, industrial control communications were twenty years ahead of the enterprise server world. (Yes, Linux was around a lot earlier, but it did not replace the mainstream Unix systems until the early 2000s. I know – I was there, working in the big banks.)
But Modicon did not impose any controls. Many manufacturers changed how they implemented Modbus, some for good technical reasons. The result is that Modbus is almost everywhere, but you have to be careful to double-check each device’s Modbus characteristics.
As heavy industry started moving to field protocol technologies, many different “user groups” sprang up to document their own protocols.
At the same time many large (very large) manufacturers offered their own proprietary communication protocols. The cynic in me thinks they did that to “lock” businesses into their technology. (see next para).
Over the years I have found that just about every manufacturer in every industry uses exactly the same communication chips (I.C.s), but they might change the packet format a little. The result is that some businesses are frightened to use any other manufacturer’s products, because they fear they will not “work”.
Right now (June 2020) I am building a test system. It will connect to just about every possible communication protocol, using just about every physical and radio medium. Before you scream “it must be expensive”, I reckon it will cost me maybe $3,000. It’s not a box of bits either. It’s just a couple of different companies’ boxes, some third-party boards and some other tools. I am currently waiting for the various parts to arrive. Then I reckon it will take me two months to get it working.
Decision Support re Stopping Drilling?
How much does it cost to stop a drill? Or - how much did you just cost the business because you did not stop the drill, and it broke?
People talk about using Machine Learning (M.L.) and Artificial Intelligence (A.I.) to help us make such decisions. But the typical M.L. or A.I. solution needs huge amounts of historical data, plus large and fast compute resources, to very quickly take a decision on our behalf or raise an alarm/alert.
Why is this so difficult? Let’s remind ourselves how computers work. If we don’t properly understand that, how can we understand M.L. or A.I.? (Read on, but please grab a coffee!)
Computer Architecture
Computers share resources, and so they have bottlenecks, for instance a single data bus. There is contention for the attention of the CPU, from hardware interrupts to scheduling requirements, e.g. resources are scheduled to do task A and so are unavailable for task B.
Virtual machines (e.g. VMware), virtual operating systems (e.g. Parallels) and containers (e.g. Docker) all still need hardware to run on, so they are affected by everything below.
Speeding a Computer Up (making the clock pulses faster?)
We need to agree on some concepts. Here are a few pictures/diagrams.
The clock in a computer system is used partly to keep the CPU working. If you have power on a CPU, but no clock, the CPU does nothing. So, for example a clock at 100 Hz triggers the CPU to do something every 100th of a second (almost – that is not 100% correct, as one clock cycle does not normally equal one machine cycle – read on).
A CPU might take the duration of two or three clock pulses to read in a new instruction.
The CPU might take a total of five or eight clock pulses to execute an instruction. So, a clock of 100 Hz does not mean that the CPU executes 100 instructions per second. Scale that up to GHz and it is just the same.
An exceptionally fast clock speed does not mean that a CPU runs at that speed. It all depends on the CPU design, and on whether that CPU is well designed to execute those specific types of instruction.
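A quick worked example makes the point. The cycles-per-instruction figure is an illustrative assumption, not a datasheet value:

# Clock speed vs real instruction rate (the CPI figure is an assumption).
clock_hz = 3_000_000_000    # a "3 GHz" CPU
cycles_per_instruction = 5  # assumed average CPI for this workload

instructions_per_second = clock_hz / cycles_per_instruction
print(f"{instructions_per_second / 1e9:.1f} billion instructions/s, not 3 billion")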
Over Clocking
Over-clocking can be effective, but it can also slow a system down. Even at the CPU level, if you over-clock a CPU, the result of an instruction may “miss” the relevant clock cycle.
In the example below the “active” clock pulse edge is the “rising” edge. Really clever/fast systems will be set up to do something on multiple parts of the clock pulse. For instance, they will do something at 63% of Vcc on the rising edge, something else at 100% Vcc, and then something else on the falling edge (maybe) at 37% of Vcc. (assuming Vcc is your max voltage level on the clock pulse)
None of that takes into account what the peripheral devices are doing, i.e. your mega-fast application is doing all it can to compute results, but your peripherals are just not fast enough to communicate with other systems. That might be because you are using the wrong hardware, drivers, networking or messaging protocols.
Using the wrong type of messaging protocol is probably the worst possible technical and commercial mistake. You can build an application that simply cannot pass all the data it needs to, or cannot adhere to a required QoS. (This is a big area, too big for this short paper.)
(Diagram: over-clocking and clock pulse edges)
Single-threading (or, as I prefer, sub-routine or task)
Start with one CPU that executes a series of instructions sequentially, with all RAM etc. dedicated solely to that program. That is the simplest single-threaded execution process.
Multiprocessing (not multithreading)
From what I can see this is not used very often. I will explain it as two or more CPUs which share physical memory, storage and input/output.
My personal thought is that this could be useful where there is no need for immediacy in your I/O, but you have some seriously large, long-running computations to run. Then, once you have computed your answer, you use the shared I/O.
I think the advantage would be that the physical size of the “box” is smaller, as it only needs one set of I/O.
Multithreading
Typically, nowadays, this is a CPU with multiple cores. The operating system needs to be designed to support multi-core operation. I researched this, and typical use cases included multiple, non-conflicting applications running separately – some might be anti-virus checking, or graphics work – i.e. applications that can co-exist without affecting each other.
I then looked into using multi-threading for low latency, or high throughput. Remember, although not regarded as an academic resource, I like Wikipedia as a starting point. You just need to be aware that some companies use it to promote their own solutions. (I have contributed to Wikipedia; I do find it useful.) Please see https://en.wikipedia.org/wiki/Multithreading_(computer_architecture)
Shared resources. If things are not set up correctly, there can be problems where the different cores try to access shared resources – see translation lookaside buffers (TLBs).
Racing. This occurs when two or more threads try to access the same memory location, at the same time, to change the value. Both threads read the same data, perform their computation, and then try to write back their own new value. Depending on how your algorithm is set up, e.g. whether it checks for “racing” or not, you can get into a “race-loop”. (I just made up the term “race-loop”.)
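Here is a minimal sketch of exactly that race, and the lock that fixes it. Run it and compare the two totals:

# The race described above, then the fix.
import threading

counter = 0
lock = threading.Lock()

def unsafe_add():
    global counter
    for _ in range(100_000):
        counter += 1  # read-modify-write: threads can interleave here

def safe_add():
    global counter
    for _ in range(100_000):
        with lock:    # only one thread at a time does the read-modify-write
            counter += 1

for target, label in ((unsafe_add, "unsafe"), (safe_add, "safe")):
    counter = 0
    threads = [threading.Thread(target=target) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(label, counter)  # "unsafe" usually loses updates; "safe" is exactly 400000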
Map Reduce
I sold a very large Map Reduce solution to someone, for a very big project. I am changing the example, but it is still relevant, so please read on.
Map Reduce, as I explain it, is for when you have some large computations to do. Each computation might involve only a small amount of new data to be manipulated, but you may need a large amount of information from other sources to compare and contrast that new data against.
So, in the Map Reduce scenario I am describing, this customer had many thousands of physical servers, and on those physical servers they had many VMs. Each VM could do some computational work, relying on the physical power of that server to do the work quickly. Yes, again, back to my single-threaded solution.
The vast amounts of information that had to be analysed to produce a result were stored in very fast local storage in those physical servers. So, the computation was done quickly.
Map Reduce – an Example
This is where you might have a change in a local reading. The drill string is now showing characteristics A, B, C, X, Y and Z, and doing that over 20 milliseconds.
Let’s imagine an environment, as follows;
· Drill string characteristic A = 123
· Drill string characteristic B = 456
· Drill string characteristic C = 789
· Drill string characteristic X = 111
· Drill string characteristic Y = 222
· Drill string characteristic Z = 333
· Time for changes = 20 ms
A=123, B=456, C=789, X=111, Y=222, Z=333, Time=20 ms
All of the above needs a massive amount of information to be mined. The master program is set up to look for things such as “Drill string characteristic A = 123”. Then the master program sends out a simple payload: “Drill string at location: A=123, B=456, C=789, X=111, Y=222, Z=333, Time=20 ms”.
These thousands of remote machines will hold huge amounts of other data, such as the effect the sea temperature had last time, or the density of sand, or the type of material previously drilled through. These thousands of machines will all run different programs, to compute different scenarios. They will then return their answers to the master system.
You might think that sending out that message over a network adds a few milliseconds of delay – and yes, it does – but those thousands of VMs now do a lot of calculations, each doing a separate, smaller part of the overall calculation.
So maybe you only really had ten separate, parallel calculations to be processed, but you sent (duplicated) that transmission maybe a hundred-fold. Why do that?
Those remote machines send back their answers to the master machine. The master machine then “reduces”, or reads in, the messages. It should have multiple answers for each of its small sub-routines. Depending on how clever your master system is, it will either simply compare the incoming answers to make sure they are the same, so the result can be trusted, or do a little more work to compare the other inputs and come out with a final result.
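Here is a toy, in-process version of the pattern I have just described. The workers’ “computation” is faked for illustration; in reality the workers are thousands of VMs, each with its own local data and program:

# Toy version of the Map Reduce pattern above, run in one process.
from collections import Counter
from multiprocessing.dummy import Pool  # thread pool, keeps the sketch portable

payload = {"A": 123, "B": 456, "C": 789, "X": 111, "Y": 222, "Z": 333, "t_ms": 20}

def worker(worker_id):
    # Each worker would compare the payload against its own local history
    # (sea temperature, sand density, etc.). Here the verdict is faked.
    risky = payload["C"] > 700 and payload["t_ms"] < 50
    return "STOP DRILL" if risky else "CONTINUE"

with Pool(100) as pool:  # "map": send the same payload out a hundred-fold
    answers = pool.map(worker, range(100))

# "Reduce": trust the result only if the duplicated answers agree.
verdict, votes = Counter(answers).most_common(1)[0]
print(verdict, f"({votes}/100 workers agree)")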
This type of “Map Reduce” is well suited to environments where the WAN/network links are excellent.
For drilling type environments, I have a different solution.
Storage
I am just going to consider solid-state storage on non-volatile memory, i.e. not spinning disks. I am not going to consider things like caching, or “committed before written” to non-volatile storage. For the purposes of this small section on storage, I am considering data that is stable, in non-volatile storage.
I’ll highlight what appears to be a new, ultra-fast storage technology. As part of my research for this, I came across a new RAM-type technology that does not need dynamic refresh pulses to keep it charged, uses 1% of the power of regular RAM, and is non-volatile. The problems are that it is still in R&D, and it will likely be expensive. But let’s look at it;
It’s called “UltraRAM”, but it does not seem to be the same “UltraRAM” that Xilinx offers. I mention that because the Xilinx product has been available since 2016, whereas this is only going through final R&D now (2020). https://www.nature.com/articles/s41598-019-45370-1
Ingesting Data to your Database or Application?
Here’s something to think about. If you want ultra-low-latency performance, do you ingest new data straight into your application, or do you ingest the data into your database? It is faster to write/insert/update straight into your application, but then you have not backed it up yet. Here’s an IoT example;
I build things in Amazon/AWS, and I like it a lot. I build IoT solutions. One easy-to-use process is to route the data based on what is called the “topic”. One of the easy/fast ways of using that incoming data is to direct it to a database table, and that’s excellent. Stay with me a bit longer, please. The standard method of backing those tables up is to S3 storage (“buckets”). The standard back-up frequency to your S3 bucket is once every 15 minutes. OOPS – what happened during the 14 minutes and 59 seconds between back-ups?
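For illustration, here is a minimal sketch of that ingest step using boto3. The table name and item schema are hypothetical, and it assumes AWS credentials and region are already configured:

# Sketch: writing an incoming IoT message straight to DynamoDB with boto3.
# Table name and field names are hypothetical; credentials/region assumed set.
import time
import boto3

table = boto3.resource("dynamodb").Table("rig_telemetry")  # assumed table

def on_message(device_id, payload):
    table.put_item(Item={
        "device_id": device_id,         # partition key (assumed schema)
        "ts": int(time.time() * 1000),  # sort key, milliseconds
        "payload": payload,             # values kept as strings for simplicity
    })

on_message("sensor-42", {"pressure_bar": "3.7"})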
Replication
If you replicate data as it comes in, how granular is your replication? For instance, many virtualisation vendors work on 4K blocks. If a corrupted 4K block spoils a PowerPoint presentation, it is a minor issue. Similarly, if it is a voice recording, does it matter if the recording is a bit noisy because you missed 250 ms of data? But if it is Blockchain or SCADA data, then even a single nibble can be a major problem.
Mirroring
I don’t like mirroring – if you corrupt the primary drive, you just corrupted the mirror.
A Snapshot is NOT a Back-Up!
Taking a snapshot is not making a back-up? What? OK, imagine you have 20 TB of data and your “snapshot” took 5 seconds. Do you really think you made a back-up of 20 TB in 5 seconds? Let me explain what a snapshot is.
Data is stored in different places in storage. Those places all have locations/addresses, which identify where the data is stored. That data will also have some type of Access Control List (ACL) and other metadata. As a minimum, a snapshot is a record of the addresses/locations of where all the data is stored – i.e. no data, just the addresses/locations the data can be recovered from. Importantly, you still need to back up all that data somewhere. A good snapshot technology might include some other information, such as ACLs and metadata. But to be clear: a snapshot is not a back-up.
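Here is the idea in miniature. (Real snapshot systems add copy-on-write and metadata, but the point stands: the snapshot holds addresses, while the data still lives on the same storage.)

# Why a snapshot is not a back-up, in miniature.
storage = {0: b"hello", 1: b"world"}  # block address -> data

snapshot = {"blocks": list(storage.keys())}  # addresses only, no data copied
backup = {addr: bytes(data) for addr, data in storage.items()}  # a real copy

storage[1] = b"CORRUPT"  # the underlying blocks change

# The snapshot still points at the (now corrupted) blocks:
print([storage[a] for a in snapshot["blocks"]])  # [b'hello', b'CORRUPT']
# Only the back-up still holds the original data:
print(list(backup.values()))                     # [b'hello', b'world']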
Back-Up summary
This is a really complex area, and everyone will have a different solution. The data you have is far more important than your applications, e.g. re-building or expanding an application is pointless if your data is corrupted, or a hacker has encrypted it with Ransomware.
Note: a Ransomware hacker uses their own encryption key to put an encryption wrapper around your valuable data. Even if your data is already encrypted, your encryption is no protection against Ransomware – the hacker simply wraps all your data in their own key, so you cannot access it without that key. The really nasty Ransomware hackers include software to detect any attempt to break their key, at which point the Ransomware automatically destroys the data. Those hackers also do the same with all of your back-ups that they can find. That is why keeping completely separate physical back-ups is so valuable.
Databases
OK a really complex area. I am not a database expert. It’s also an area that the SQL and NoSQL people have very different opinions on. But I wanted to clear up one thing about SQL and NoSQL databases.
SQL and NoSQL
SQL stands for Structured Query Language, i.e. I can write a query to interrogate a database, and that query must have a structure to it. SQL databases were developed around several different tables, where the data in one table could be related to the data in another, i.e. Table 1 might have some data that is in some way related, or relevant, to some contents of Table 2.
But in NoSQL I also use a structured query language to query my database, so why are they different?
NoSQL in its early days was referred to as Non-Relational SQL. It is also sometimes expanded as “Not only SQL”, the implication being that NoSQL databases can do what SQL can do and more besides. Apparently we should not use Wikipedia, as it is not seen as a true academic source, but I like Wikipedia articles – at least they can be a good starting point. See https://en.wikipedia.org/wiki/NoSQL
If you do any type of web search on Google etc. for SQL vs NoSQL, or NoSQL vs SQL, you will find strong arguments from supporters of each. But here is something I have found often;
SQL databases are vertically scalable while NoSQL databases are horizontally scalable. SQL databases have a predefined schema whereas NoSQL databases use dynamic schema for unstructured data.
I personally have used MongoDB, and now DynamoDB (Amazon’s own NoSQL database); both are NoSQL. I have not really used any SQL databases. I just use DynamoDB, as it comes as part of Amazon/AWS and I find it easy to use.
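To make the difference concrete, here is the same order stored both ways. The data is made up; sqlite3 is in the Python standard library:

# One order, stored the SQL way and the NoSQL (document) way.
import sqlite3

# SQL: normalised across related tables, predefined schema, joined at query time.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)")
db.execute("INSERT INTO customers VALUES (1, 'Acme Drilling')")
db.execute("INSERT INTO orders VALUES (10, 1, 'pressure sensor')")
row = db.execute("""SELECT c.name, o.item FROM orders o
                    JOIN customers c ON c.id = o.customer_id""").fetchone()
print(row)  # ('Acme Drilling', 'pressure sensor')

# NoSQL (document style): the whole entity stored together, no fixed schema.
order_document = {
    "order_id": 10,
    "customer": {"id": 1, "name": "Acme Drilling"},
    "item": "pressure sensor",
}
print(order_document["customer"]["name"], order_document["item"])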
SQL vs NoSQL for Speed !
This is the best explanation I have found, see https://www.geeksforgeeks.org/sql-vs-nosql-which-one-is-better-to-use/
SQL databases are normalized databases where the data is broken down into various logical tables to avoid data redundancy and data duplication. In this scenario, SQL databases are faster than their NoSQL counterparts for joins, queries, updates, etc.
On the other hand, NoSQL databases are specifically designed for unstructured data which can be document-oriented, column-oriented, graph-based, etc. In this case, a particular data entity is stored together and not partitioned. So performing read or write operations on a single data entity is faster for NoSQL databases as compared to SQL databases.
Above came from a website called “Geeks for Geeks”. https://www.geeksforgeeks.org
The Application Language
For ultra-low-latency applications, consider using hardware to do some of the processing.
Parallel Programming using VHDL?
But that’s hardware programming – it’s logic gates? Yes. If you want an ultra-fast application, use an ASIC. If you have to change the program from time to time, use an FPGA.
The fastest financial systems use ASICs and FPGAs. There are “languages” that help you design these I.C.s – see VHDL, which can be considered a parallel programming language: https://en.wikipedia.org/wiki/VHDL
C++
It seems that C++ is one of the fastest general-purpose languages.
I tried my trusty Wikipedia but I was not comfortable with what I read. I searched and searched and found two good, but very, very “deep” treasure troves of detailed technical information.
University of Cambridge - once you go to this first url, follow the many other links, if you really want to learn more. The second link is one that I followed and has over 200 pages of explanation.
Developers – pay for the best – they are the cheapest !
I would always pay a brilliant developer twice the day rate/salary of an average coder, i.e. brilliant coders are the route to the lowest-latency applications. (But read on – throwing money at them is not the whole answer to good coding.)
Future Proofing your Code
Make sure your developer puts lots of detailed comments into the code, then get that checked by someone else. Otherwise you end up with some pure code that only that nice developer knows how it works, and then you are very dependent on that developer. Not a good place to be.
Right from the start, get your coder to comment their code. Make sure they include references, e.g. //ref 1234, and then explain the reference in the comment.
Then get an external resource/web package, to document and record everything.
This may seem like paranoia, but one of the problems I have seen with really good coders is that they love problem solving – some are even hackers (nice hackers) – and few can be bothered to document very much. Part of their fun is coming back six months later and busting their brain trying to remember what they did to get this working in the first place.
That is where the average coder can be of value. You have the “excellent” coder(s) doing the complex development work, and those better at support/documentation doing the documentation, testing and recording of code, APIs etc.
The Five Button Mouse or Font Library problems?
A few things to think about?
What are the sources of data input, and where are the data, or results, going? I worked on a project where the core application had never “seen” a five-button mouse, and another where an application was moved from one part of the world to another and the application server encountered different character sets – a different font library. In all those situations the outputs were problematic. You might ask why these were not planned for? Good point. What you can do is design your program so that it limits what can be input.
So, in Amazon/AWS for instance, I work on IoT solutions. I limit what any connected device can do. I can also limit what data/payload will be considered as valid. I can do that at the message broker (in AWS), and I can also do that at the IoT gateway end, so that before anything even gets to the AWS Cloud, I have already done one review and cleanse.
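Here is a minimal sketch of such an allow-list check at the gateway. The field names and limits are made up for the example:

# Sketch of an allow-list payload check at the gateway (limits are made up).
ALLOWED = {
    "pressure_bar": (0.0, 10.0),   # known physical minimum/maximum
    "temp_c":       (-40.0, 70.0),
    "flow_lpm":     (0.0, 500.0),
}

def sanitise(payload: dict) -> dict:
    clean = {}
    for key, value in payload.items():
        if key not in ALLOWED:
            continue  # unknown fields never leave the gateway
        lo, hi = ALLOWED[key]
        if isinstance(value, (int, float)) and lo <= value <= hi:
            clean[key] = value  # only real, in-range values pass
    return clean

print(sanitise({"pressure_bar": 3.2, "temp_c": 999, "evil": "rm -rf"}))
# -> {'pressure_bar': 3.2}  (out-of-range and unknown entries dropped)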
Making the Core Application Faster by Tuning the Input(s)
If we send every possible piece of data from all possible inputs into the core application, in real time, then we are starting our own Denial of Service, or Distributed Denial of Service, attack. Why do people do that? (Because it delivers a quick solution!)
Yes, you probably want everything you can possibly receive, and you want to store it forever, and you want every possible log file, stored forever too. You also want incremental back-ups every single minute, and they must all be stored forever. Maybe that is just too much of a good thing? It is way too much of everything.
I suggest a different approach, but one you can change. Accept you need to compromise.
Use your edge, or your inputs, to make some basic decisions about what needs to be sent immediately, where possible.
That edge needs to have some intelligence. It will route real time data or alerts immediately to the central application with the appropriate QoS.
Now you are routing less traffic. Your core network is less busy, so latency is reduced. Your core application is executing fewer processes, so it is performing faster. Your databases and dashboards are also performing better.
Look at the Logs !
See above first, if you just jumped here!
OK, all seems good. Now start to look at the logs. See what is failing. People don’t realise they have problems, because they don’t look at the logs!
Now we are in “tuning mode”. This is the 80% rule. Go for 80% improvement based on what your logs tell you. Trying for that last remaining 20% is just not worthwhile, and by the time you get close, the goal-posts will have moved.
Write your own Audit Trail
See above.
Logs only work if someone wrote a program to watch and record something. People say there is nothing wrong because the logs are clear? Really the situation is: “we have no idea whether anything is wrong, because we don’t know how to measure that type of situation”.
Or we thought the project was finished and “signed off”? It is never finished. Always be reviewing your apps, because the hackers never stop – they are always looking at your code. If you stop, then you just gave up being serious about security!
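A minimal sketch of a home-grown audit trail follows. The file path, actors and event names are illustrative:

# Minimal sketch of a home-grown audit trail (path and fields are illustrative).
import json
import logging
import time

audit = logging.getLogger("audit")
audit.addHandler(logging.FileHandler("audit_trail.jsonl"))
audit.setLevel(logging.INFO)

def record(actor, action, target, **extra):
    # Record everything - including actions we do not yet know are interesting.
    audit.info(json.dumps({
        "ts": time.time(), "actor": actor,
        "action": action, "target": target, **extra,
    }))

record("operator7", "write_register", "plc-3/valve-1", value=1)
record("unknown", "login_failed", "gateway-2")  # the kind of event logs often miss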
A CISO might like these next points?
As part of the development programme, you are going to create an audit trail and log events that right now we don’t know anything about. That helps to identify security holes, or activities that could be of value in terms of security.
You are going to comment the code on every line, use a lower-skilled developer to document it, and check whether that lower-skilled developer can make amendments. That makes sure future development of the application is secure.
Instead of allowing any payload, you are going to use an “allowed list” of payload entries at all data entry points, even before the payload gets into the network. I am not talking about known signatures; I am talking about real values. If you are measuring a value, set the upper and lower limits to the known maximum and minimum. Do that for everything. That helps stop DoS and DDoS attacks.
The End