Did you ever wonder, when people say hundreds of terabytes (and much more) of data are roaming around the #World, where you have to look? Here is the role ML plays in it.
Pradeep K.
Strategic Operations Leader | Proactive Negotiator | Team Dynamics Expert | Applying Computer Science for Seamless Client Relationships and Actionable Leadership
Just watch it if you have less time.
But if you are interested in physics and bored by the way you learnt it, then you must watch till the end.
Let me introduce you to CERN, i.e.
(a journey on the quest to find God and the first element created in the Universe)
Introduction to CERN:
The European Organization for Nuclear Research is known as CERN.
Did you know that CERN is also the birthplace of the World Wide Web? Then you must also know who created it.
- Tim Berners-Lee, a British scientist, invented the World Wide Web (WWW) in 1989, while working at CERN. The web was originally conceived and developed to meet the demand for automated information-sharing between scientists in universities and institutes around the world.
- CERN is a European research organization that operates the largest particle physics laboratory in the world. Established in 1954, the organization is based in a northwest suburb of Geneva on the Franco–Swiss border and has 23 member states. Israel is the only non-European country granted full membership. CERN is an official United Nations Observer.
- CERN's main function is to provide the particle accelerators and other infrastructure needed for high-energy physics research – as a result, numerous experiments have been constructed at CERN through international collaborations.
- The main site at Meyrin hosts a large computing facility, which is primarily used to store and analyse data from experiments, as well as simulate events. Researchers need remote access to these facilities, so the lab has historically been a major wide area network hub.
Interesting fact 1:
Over 600 million collisions occur each second, and roughly one in a million collisions is of interest.
Interesting fact 2:
Around 100 GB of data is generated per second and stored around the world at the Tier-0 and Tier-1 sites (hosted in 11 permanent countries).
A distributed network around the world analyses the data from LHC collisions, and research on these reports is still going on across various top universities. So far the Higgs boson (the "God particle") has been found, and the rest is yet to be revealed.
A brief overview of what CERN does:
The LHC produces 600 million collisions every second in each detector, which generates approximately one petabyte of data per second. None of today’s computing systems are capable of recording such rates. Hence sophisticated selection systems are used for a first fast electronic pre-selection, only passing one out of 10 000 events. Tens of thousands of processor cores then select 1% of the remaining events. Even after such a drastic data reduction, the four big experiments, ALICE, ATLAS, CMS and LHCb, together need to store over 25 petabytes per year. The LHC data are aggregated in the CERN Data Centre, where initial data reconstruction is performed, and a copy is archived to long-term tape storage. Another copy is sent to several large scale data centres around the world. Subsequently hundreds of thousands of computers from around the world come into action: harnessed in a distributed computing service, they form the Worldwide LHC Computing Grid (WLCG), which provides the resources to store, distribute, and process the LHC data. WLCG combines the power of more than 170 collaborating centres in 36 countries around the world, which are linked to CERN. Every day WLCG processes more than 1.5 million ‘jobs’, corresponding to a single computer running for more than 600 years.
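To make the scale of this reduction concrete, here is a small back-of-the-envelope sketch in Python. The rates and reduction factors are simply the ones quoted above; the variable names are mine.

```python
# Back-of-the-envelope sketch of the LHC trigger data reduction,
# using the approximate rates quoted above.

collisions_per_sec = 600e6          # ~600 million collisions per second per detector
raw_data_per_sec_bytes = 1e15       # ~1 petabyte per second of raw data

hardware_trigger_keep = 1 / 10_000  # fast electronic pre-selection keeps 1 in 10 000 events
software_trigger_keep = 0.01        # processor farm then keeps ~1% of what remains

events_kept_per_sec = collisions_per_sec * hardware_trigger_keep * software_trigger_keep
overall_reduction = 1 / (hardware_trigger_keep * software_trigger_keep)

print(f"Events kept per second  : {events_kept_per_sec:,.0f}")
print(f"Overall reduction factor: 1 in {overall_reduction:,.0f}")
# Even after a one-in-a-million reduction, the four big experiments
# together still store over 25 petabytes per year.
```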
USE CASES:
- As a space enthusiast, I would also like to mention NASA's deep-space explorations: pattern analysis and prediction for other space missions, and for external space effects, from solar flares to lunar winds and much more that affects Mother Earth.
Where Does CERN Use Machine Learning?
- With about one billion proton–proton collisions per second at the Large Hadron Collider (LHC), the LHC experiments need to quickly choose which collisions to analyse.
Billions of protons are sent around the 27 km ring so that they collide head-on at the interaction points, and only about one in billions of those collisions is of interest. A major issue is that, during their journey around the ring, protons also hit each other sideways, and such glancing collisions do not give enough useful data to explore. So CERN wants to apply machine learning to reduce the probability of billions of protons colliding sideways.
Until now, this collision-parameter checking has relied on human interaction, but it is a huge job to search and classify the roughly 25 GB of data generated every second, analysing it and re-checking it over the next hours, days and even months in a continuous process. Machines can do this much more quickly, using programmable FPGAs that run machine-learning algorithms which CERN experts have shortened to fit the capabilities of the hardware in use, choosing the algorithm suited to the classification task and putting it into operation.
- To cope with an even higher number of collisions per second in the future, scientists are investigating computing methods such as machine-learning techniques. A new collaboration is now looking at how these techniques, deployed on chips known as field-programmable gate arrays (FPGAs), could apply to autonomous driving, so that the fast decision-making used for particle collisions could help prevent collisions on the road.
What are FPGAs? (from scratch)
- An FPGA is an integrated circuit designed to be configured by a customer or a designer after manufacturing – hence the term "field-programmable". Circuit diagrams were previously used to specify the configuration, but this is increasingly rare due to the advent of electronic design automation tools.
FPGAs have been used at CERN for many years and for many applications. Unlike the central processing unit of a laptop, these chips follow simple instructions and process many parallel tasks at once.
The challenge, however, has been to fit complex deep-learning algorithms (a particular class of machine-learning algorithms) in chips of limited capacity. This required software developed for the CERN-based experiments, called "hls4ml", which reduces the algorithms and produces FPGA-ready code without loss of accuracy or performance, allowing the chips to execute decision-making algorithms in microseconds.
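To illustrate how this looks in practice, here is a minimal, hedged sketch of converting a small Keras network to FPGA-ready code with the hls4ml Python package. The network architecture, FPGA part number and output directory are placeholder choices of mine, not CERN's actual trigger model.

```python
# Minimal sketch: compress a small neural network into FPGA-ready code with hls4ml.
# The model architecture, FPGA part and output directory are illustrative placeholders.
import numpy as np
import hls4ml
from tensorflow import keras

# A tiny classifier standing in for a trigger-level selection model.
model = keras.Sequential([
    keras.Input(shape=(16,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Derive an hls4ml configuration (fixed-point precision, reuse factors) from the model.
config = hls4ml.utils.config_from_keras_model(model, granularity="model")

# Convert the network into an HLS project that can be synthesized for an FPGA.
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir="hls4ml_prj",        # placeholder project directory
    part="xcu250-figd2104-2L-e",    # example Xilinx part, not a CERN choice
)

# Compile a C simulation of the firmware and compare it against the Keras predictions.
hls_model.compile()
x = np.random.rand(4, 16).astype("float32")
print("Keras :", model.predict(x).ravel())
print("hls4ml:", hls_model.predict(x).ravel())
```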
- The collaboration plans to use the techniques and software developed for the experiments at CERN to research deploying deep learning (a particular class of machine-learning algorithms) on FPGAs for autonomous driving. Instead of particle-physics data, the FPGAs will be used to interpret the huge quantities of data generated by normal driving conditions, using readouts from car sensors to identify pedestrians and vehicles.
The technology should enable self-driving cars to make faster and better decisions and predictions, thus helping to avoid traffic collisions.
Addressing Computing Challenges:
At CERN, HEP models are designed and optimized for specific tasks:
- Generally custom models
- Fewer weights and operations than off-the-shelf models
- Higher accuracy on the results that matter, such as collision selection
Depending on the task, we need:
- Fast inference
- Online Training Capability
- Fast Training for Large optimizations
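For the online training requirement in particular, one common pattern is incremental learning, where the model is updated as each new batch of events arrives instead of being retrained from scratch. Below is a minimal sketch using scikit-learn's SGDClassifier; the feature dimension and the simulated data stream are placeholders, not a CERN pipeline.

```python
# Minimal sketch of online (incremental) training with scikit-learn.
# The feature dimension and data stream are synthetic placeholders.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier(loss="log_loss")   # logistic regression trained by SGD

classes = np.array([0, 1])             # must be declared on the first partial_fit call
for batch in range(100):
    # Stand-in for a new batch of reconstructed events arriving online.
    X = rng.normal(size=(256, 8))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
    clf.partial_fit(X, y, classes=classes)

# The model can now score new events immediately, without a full retrain.
X_new = rng.normal(size=(5, 8))
print(clf.predict(X_new))
```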
Reperforming a Nobel-Prize Discovery on Kubernetes and the Google Cloud
Analyzing the data in about 90 seconds involved:
- 1200 nodes with about 104 GB of RAM each and 20,000 cores in total, i.e. roughly 120 TB of RAM on Kubernetes. The complete process of data stage-in followed by processing took about 90 seconds, reproducing in software an analysis that earned a Nobel prize in this domain.
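A quick sanity check of those cluster numbers, using only the approximate figures quoted above:

```python
# Rough sanity check of the quoted cluster size (approximate figures from the text).
nodes = 1200
ram_per_node_gb = 104
total_cores = 20_000

total_ram_tb = nodes * ram_per_node_gb / 1000   # ~125 TB, roughly the quoted 120 TB
cores_per_node = total_cores / nodes            # ~17 cores per node

print(f"Total RAM : ~{total_ram_tb:.0f} TB")
print(f"Cores/node: ~{cores_per_node:.0f}")
```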
Issues considered during ML model generation:
Collimator alignment campaigns involve continuously moving the jaws towards the beam, whilst ignoring any non-alignment spikes, until a clear alignment spike is observed. An alignment spike indicates that the moving jaw touched the beam halo and is hence in contact with the primary beam.
It consists of a steady-state signal before the spike (corresponding to movements of the jaws before the beam is reached), the loss spike itself, the temporal decay of losses, and a steady-state signal after the spike. This second steady-state, with larger losses than the first one, is a result of the continuous scraping of halo particles when the jaw positions are fixed. The further a jaw cuts into the beam halo the more the steady-state signal increases, as the density of the particles near the jaw increases.
Any other spikes which do not follow this pattern are classified as non-alignment spikes. They do not have a fixed structure and can contain spurious high spikes. Such non-alignment spikes arise due to other factors, i.e. beam instabilities or mechanical vibrations of the opposite jaw, thus indicating that the jaw has not yet touched the beam and must resume its alignment. In order to achieve a reliable alignment, one has to be able to correctly identify such alignment spikes. Note that in a single alignment campaign, hundreds of such spikes need to be analysed.
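To make this spike structure concrete, the following sketch synthesizes a loss signal with the four phases described (pre-spike steady state, spike, exponential decay, higher post-spike steady state) and fits the decay with SciPy. The signal is simulated for illustration, not real BLM data.

```python
# Sketch: synthesize an alignment-spike-like BLM signal and fit its decay.
# The signal below is simulated for illustration, not real BLM data.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
t = np.arange(0, 6, 0.01)                        # time axis in seconds (100 Hz samples)

pre_steady, post_steady, spike_height = 1.0, 2.0, 10.0
signal = np.where(t < 2.0, pre_steady, post_steady)           # steady states before/after
decay_region = t >= 2.0
signal[decay_region] += spike_height * np.exp(-3.0 * (t[decay_region] - 2.0))
signal += rng.normal(scale=0.05, size=t.size)                 # measurement noise

# Fit the decay after the spike with a * exp(-b * x) + c, as in the feature list below.
def decay(x, a, b, c):
    return a * np.exp(-b * x) + c

x_fit = t[decay_region] - 2.0
a, b, c = curve_fit(decay, x_fit, signal[decay_region], p0=(5.0, 1.0, 1.0))[0]
print(f"fitted decay: a={a:.2f}, b={b:.2f}, c={c:.2f}")        # c ~ post-spike steady state
```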
To fully automate the beam-based alignment (BBA), the process of spike recognition was cast as a classification problem, and ML models were trained to distinguish between the two spike patterns in the BLM losses. Data was gathered from 11 semi-automatic collimator alignment campaigns performed in 2016 and 2018, both at injection and at flat top. A total of 6446 samples were extracted: 4379 positive (alignment spikes) and 2067 negative (non-alignment spikes). The data logged during alignment campaigns consists of the 100 Hz BLM signals and the collimator jaw positions logged at a frequency of 1 Hz. The data extracted for the data set consists of the moments when each collimator jaw stopped moving and the losses exceeded the threshold defined for the semi-automatic alignment.
The resulting five most important features were:
- Height (1 feature) — This is calculated by subtracting the average steady state losses before the spike from the maximum value. The average steady state is calculated from the BLM signal after the decay of the previous alignment, until the current collimator was stopped.
- Spike decay (3 features) — Exponential fit to the decay in the BLM signal using a·e^(−bx) + c
- Position in sigma (1 feature) — A beam size invariant way of expressing the fraction of the normally distributed beam interrupted by the jaw, as the beam size in mm varies across locations in the accelerator.
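Putting these together, here is a hedged sketch of training a classifier on those five features (height, the three decay-fit parameters, and position in sigma). The features and labels below are randomly generated stand-ins, since the real BLM dataset is not reproduced in this post; only the sample counts follow the numbers quoted above.

```python
# Sketch: train a classifier on the five spike features described above.
# Features and labels are randomly generated stand-ins for the real dataset
# (6446 samples: 4379 alignment spikes, 2067 non-alignment spikes).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(2)
n_pos, n_neg = 4379, 2067

def make_features(n, is_alignment):
    height = rng.normal(8.0 if is_alignment else 3.0, 2.0, n)   # spike height
    a = rng.normal(5.0 if is_alignment else 2.0, 1.0, n)        # decay fit: a
    b = rng.normal(3.0 if is_alignment else 0.5, 0.5, n)        # decay fit: b
    c = rng.normal(2.0 if is_alignment else 1.0, 0.3, n)        # decay fit: c
    sigma = rng.normal(5.0, 1.0, n)                              # jaw position in sigma
    return np.column_stack([height, a, b, c, sigma])

X = np.vstack([make_features(n_pos, True), make_features(n_neg, False)])
y = np.concatenate([np.ones(n_pos), np.zeros(n_neg)])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```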
CONCLUSION:
- The classification of vacuum gauge measurements to identify possible heating issues during LHC operation has also been considered as a candidate for ML applications. Promising results have been obtained, and a multi-layer perceptron has been shown to perform better in terms of recall score (a minimal sketch of this kind of classifier follows this list).
- Currently, more ML and deep learning approaches are under investigation to push further the performance of the classification algorithms. The ultimate goal is to develop an application to be deployed in routine operation during the LHC Run3.
- These results need further physical analysis to get more insight into the observed features, but they represent a very useful new tool.
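As referenced in the first conclusion point, here is a minimal sketch of that kind of multi-layer perceptron classification evaluated with a recall score. The vacuum-gauge features and labels are synthetic placeholders, not real gauge measurements.

```python
# Sketch: multi-layer perceptron classifier evaluated with recall,
# in the spirit of the vacuum-gauge heating classification mentioned above.
# The data is synthetic; real gauge measurements are not reproduced here.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score

rng = np.random.default_rng(3)
X = rng.normal(size=(2000, 10))                  # placeholder gauge-derived features
y = (X[:, 0] + 0.8 * X[:, 3] > 1.0).astype(int)  # placeholder "heating issue" label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0, stratify=y)
mlp = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
mlp.fit(X_tr, y_tr)
print("recall:", recall_score(y_te, mlp.predict(X_te)))
```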
Things you must also know, from my end:
- This is One Truth that we haven't seen.
- A statue of Lord Shiva, the creator and destroyer, is placed outside the headquarters of CERN.
- As Einstein said, "Science without religion is lame, religion without science is blind."
- Including CERN’s curious choice of geographic location
- A gateway to the underground world, and many more things to know...!