20 Free Big Data Books You Shouldn’t Miss in 2016

How many of us would agree or disagree with the statement below? Please comment.

Big Data != Hadoop != NoSQL != NewSQL?

In this blog we have listed the best free big data books that we should not miss in 2016, as we look ahead to the new year, 2017.

The Big Data Transformation

This book, written by Alice LaPlante, focuses on massively parallel processing (MPP) analytical databases, which let us run queries and dashboards on a variety of business metrics at extreme speed and exabyte scale. By reading it we get answers to the following questions: how one prominent data storage company convinced both business and tech stakeholders to adopt an MPP analytical database; why performance marketing technology company Criteo used a Center of Excellence (CoE) model to ensure the success of its big data analytics endeavors; how YPSM uses Vertica to speed up its Hadoop-based data processing environment; why Cerner adopted an analytical database to scale its highly successful health information technology platform; and how Etsy drives success with the company’s big data initiative by avoiding common technical and organizational mistakes.

Effective Multi-Tenant Distributed Systems

This book is written by Chad Carson, who led teams at Microsoft, Yahoo, and Inktomi that used huge amounts of data to build web-scale products, including social search at Bing and sponsored search ranking and optimization at Yahoo. By reading it we learn about the following topics: how Hadoop and other multi-tenant distributed systems work, and why performance matters; business-visible symptoms of performance problems, such as late jobs, inconsistent runtimes, and underutilized hardware; scheduling challenges in multi-tenant systems; symptoms of and solutions for CPU performance limitations; the physical and virtual limits of node memory, and what happens when we run out; identifying and solving performance problems caused by disk and network limits and other typical bottlenecks; and solutions for monitoring performance and accurately allocating cluster costs among users and business units.

Fast Data Architectures for Streaming Applications

This book is written by Dean Wampler, Ph.D., the Architect for Big Data Products and Services and a member of the office of the CTO at Lightbend (formerly Typesafe). He leads Lightbend's technical architecture for Fast Data using Spark, Kafka, Mesos, Akka, and other tools. The book's premise is that batch-mode processing isn’t going away, but exclusive use of batch systems is now a competitive disadvantage. By reading it we can: learn step by step how a basic fast data architecture works; understand why event logs are the core abstraction for streaming architectures, while message queues are the core integration tool; use methods for analyzing infinite data sets, where we don’t have all the data and never will; take a tour of open source streaming engines and discover which ones work best for different use cases; get recommendations for making real-world streaming systems responsive, resilient, elastic, and message driven; and explore an example streaming application for the IoT, namely telemetry ingestion and anomaly detection for home automation systems.

The Global Impact of Open Data

This book is written by Andrew Young and Stefaan G. Verhulst. Andrew Young is the Associate Director of Research at The GovLab (www.thegovlab.org), where he leads a number of grant-funded research efforts focusing on the impact of technology on public institutions. Stefaan G. Verhulst is the Co-Founder and Chief R&D Officer of The GovLab at New York University’s Tandon School of Engineering, responsible for experimentation and evidence gathering on how to transform governance by using advances in science and technology. By reading this book we get insights into: recommendations and implementation steps for policymakers, entrepreneurs, and activists looking to leverage open data; key challenges, such as resource shortages and inadequate privacy or security protections; four conditions that enable open data to work, including organizational partnerships and collaborations; case studies of open data projects for improving government in Brazil, Sweden, Slovakia, and other countries; projects for empowering citizens in Tanzania, Kenya, Mexico, and Uruguay; new business opportunities enabled by open weather, geo-location, and market research data; and public problem-solving efforts built on open data for Ebola in Sierra Leone, dengue fever in Singapore, and earthquakes in New Zealand.

Data and Democracy

This book is written by Andrew Therriault, who was the Democratic National Committee's Director of Data Science from 2014 to 2016, leading a team that developed voter targeting models and other analytic tools used by thousands of Democratic campaigns. By reading it we learn about: the role of data in campaigns; essentials of modeling and microtargeting; data management for political campaigns; how technology is changing the polling industry; data-driven media optimization; how (and why) to follow the money in politics; digital advertising in the post-Obama era; and election forecasting in the media.

Architecting for Access

This book is written by Rich Morrow, a 20-year veteran of IT and an expert in big data technologies like Hadoop. He has been teaching Cloudera (Hadoop) and AWS for nearly 3 years, retains all certifications for both, and uses these technologies in his day-to-day consulting practice. Fragmented, disparate backend data systems have become the norm in today’s enterprise, where you’ll find a mix of relational databases, Hadoop stores, and NoSQL engines, with access and analytics tools bolted on every which way. This mishmash of options presents a real challenge when it comes to choosing frontend analytics and visualization tools. By reading this book we can: understand why and how data became so fractured so quickly; explore the tangled web of data and backend tools in today’s enterprises; learn the tool requirements for accessing and analyzing the full spectrum of data; examine the relative strengths of popular analytics and visualization tools, including Looker, Tableau, and MicroStrategy; and inspect Looker’s unique focus on both the frontend and backend.

Getting Data Right

By reading this book we learn about the fundamental challenges that data variety poses to enterprises looking to maximize the value of their existing investments, and how new approaches promise to help organizations embrace and leverage the fundamental diversity of data. Readers will also find best practices for designing bottom-up and probabilistic methods for finding and managing data; principles for doing data science at scale in the big data era; ways of preparing and unifying data that complement existing systems; advice on optimizing data warehousing; and guidance on using “data ops” to automate large-scale integration. The book is written by a team of authors: Jerry Held, Michael Stonebraker, Thomas H. Davenport, Ihab Ilyas, Michael L. Brodie, Andy Palmer, and James Markarian.

Hadoop and Spark Performance for the Enterprise

This book is written by Andy Oram, an editor at O'Reilly Media, a highly respected book publisher and technology information provider. An employee of the company since 1992, Andy currently specializes in open source, software engineering, and health IT, but his editorial output has ranged from a legal guide covering intellectual property to a graphic novel about teenage hackers. By reading this book we get insight into the following topics: multiple users contending for resources, such as those on operating systems; jobs that grow or shrink in hardware usage, so they don’t strain at resource limits or let resources go to waste; jobs of different priorities, including soft real-time requirements that allow them to override lower-priority or ad hoc jobs; and performance guarantees, similar to service-level agreements (SLAs).

In Search of Database Nirvana

The database pendulum is in full swing. Ten years ago, web-scale companies began moving away from proprietary relational databases to handle big data use cases with NoSQL and Hadoop. Now, for a variety of reasons, the pendulum is swinging back toward SQL-based solutions. What many companies really want is a system that can handle all of their operational, OLTP, BI, and analytic workloads. Could such an all-in-one database exist? By reading this book we learn about: the challenges of having one query engine support operational, BI, and analytical workloads; efforts to produce a query engine that supports multiple storage engines; attempts to support multiple data models with the same query engine; why a hybrid transactional/analytical processing (HTAP) database engine needs to provide enterprise-caliber capabilities, including high availability, security, and manageability; and how to assess various options for meeting workload requirements with one database engine, or a combination of query and storage engines.

The Big Data Market

This book is written by Aman Naimat, the SVP of Technology at Demandbase, where he is working on creating the first artificial intelligence account-based marketing platform. Aman was previously co-founder and CTO of Spiderbook, a data-driven sales engine. By reading this book we get details on: the relatively small number of companies using big data in production; the industries that have embraced big data the most, and the least; the amount of money spent on various big data use cases; and how many companies actually use “fast data”.

Data Infrastructure for Next-Gen Finance

This book is written by Jane Roberts, an award-winning technical writer with over 25 years of experience writing documentation, including training materials, marketing collateral, technical manuals, blogs, white papers, case studies, style guides, big data content, and web content. Jane is also a professional artist. The report examines the tools and best practices that leading financial firms are using to migrate data to the cloud, build customer event hubs, and adhere to new rules for governance and security. By reading it we can: learn how FINRA migrated their portfolio from a data warehouse to the Hadoop cloud ecosystem; understand what’s required to support data governance in finance, and learn about the infrastructure Capital One implemented; delve into Hadoop’s security maturity model, compliance-ready security controls, and enterprise data hub for preventing breaches; and examine the architecture of a Customer Event Hub, a tool that’s pushing the boundaries of how organizations interact with customers.

Analyzing Data in the Internet of Things

The Internet of Things (IoT) is growing fast. According to the International Data Corporation (IDC), more than 28 billion things will be connected to the Internet by 2020, from smartwatches and other wearables to smart cities, smart homes, and smart cars. This book is written by Alice LaPlante, an award-winning writer who has been writing about technology, and the business of technology, for more than 20 years. By reading it we learn about: using Spark Streaming for proactive maintenance and accident prevention in railway equipment; monitoring subway and expressway traffic in Singapore using telco data; managing emergency vehicles through situation awareness of traffic and weather in the smart city pilot in Oulu, Finland; capturing and routing device-based health data to reduce cardiovascular disease; and using data analytics to reduce human space flight risk in NASA’s Orion program.

Making Sense of Stream Processing

This book is written by Martin Kleppmann, a researcher in distributed systems at the University of Cambridge. Previously he was a software engineer and entrepreneur at Internet companies including LinkedIn and Rapportive, where he worked on large-scale data infrastructure. In the process he learned a few things the hard way, and he hopes this book will save us from repeating the same mistakes. The book is based on the philosophy behind Apache Kafka and scalable stream data platforms. By reading it we can: understand stream processing fundamentals and their similarities to event sourcing, CQRS, and complex event processing; learn how logs can make search indexes and caches easier to maintain; explore the integration of databases with event streams, using the new Bottled Water open source tool; and turn our database architecture inside out by orienting it around streams and materialized views.

The Hadoop Performance Myth

The Hadoop Performance Myth explains why best practices lead to underutilized clusters, and which new tools can help. The book is written by Courtney Webster, a reformed chemist in the Washington, D.C. metro area. She spent a few years after grad school programming robots to do chemistry and is now managing web and mobile applications for clinical research trials. In this book she examines the root cause of Hadoop performance problems and explains why best practices for mitigating them (cluster tuning, provisioning, and even cluster isolation for mission-critical jobs) don’t provide viable, scalable, or long-term solutions.

Architecting Data Lakes

Architecting Data Lakes covers data management architectures for advanced business use cases. The book is written by Alice LaPlante, an award-winning writer who has been writing about technology, and the business of technology, for more than 20 years, and co-authored by Ben Sharma, CEO and co-founder of Zaloni. He is a passionate technologist with experience in business development, solutions architecture, and service delivery of big data, analytics, and enterprise infrastructure solutions. From this book we'll learn: the key attributes of a data lake, including its ability to store information in native formats for later processing; why implementing data management and governance in your data lake is crucial; how to address various challenges in building and managing a data lake; self-service options that enable different users to access the data lake without help from IT; and emerging trends that will shape the future of data lakes.

Ten Signs of Data Science Maturity

Ten Signs of Data Science Maturity is written by Peter Guerra and Dr. Kirk Borne. Peter Guerra is a Vice President in Booz Allen Hamilton’s Strategic Innovation Group, co-leading the Data Science team. Dr. Kirk Borne has been the Principal Data Scientist at Booz Allen Hamilton since 2015, and he supports the Strategic Innovation Group in the area of NextGen Analytics and Data Science. The report provides a detailed discussion of each of the 10 signs of data science maturity, which, among many other things, encourage us to: give members of our organization access to all our available data; use Agile and leverage "DataOps", that is, DevOps for data product development; help our data science team sharpen its skills through open or internal competitions; and personify data science as a way of doing things, and not a thing to do.

Data and Electric Power

Data and Electric Power covers the move from deterministic machines to probabilistic systems in traditional engineering. In this O’Reilly report, Sean Patrick Murphy, Chief Data Scientist at PingThings, describes how data science is helping electric utilities make sense of a stochastic world filled with increasing uncertainty, including fundamental changes to the energy market and random phenomena such as weather and solar activity. Murphy also reviews several cutting-edge tools for storing and processing big data that he’s used in his work with electric utilities, tools that can help traditional engineers pursue a data-driven approach in many industries. The report covers: key drivers that have changed the electric grid from a deterministic machine into a probabilistic system; fundamental differences that put traditional engineering and data science at odds with one another; why the time is right for engineering organizations to adopt a complete data-driven approach; contemporary tools that traditional engineers can use to store and process big data; and a PingThings case study on dealing with random geomagnetic disturbances to the energy grid.

Hadoop: What You Need to Know

Hadoop: What You Need to Know is a report that covers Hadoop basics for the enterprise decision maker. Hadoop represents a major shift from traditional enterprise data warehousing and data analytics, and its technology can be daunting at first. Donald Miner, founder of the data science firm Miner & Kasch, covers just enough ground so we can make intelligent decisions about Hadoop in our enterprise. By the end of this report, we will know the basics of technologies such as HDFS, MapReduce, and YARN, without becoming mired in the details. Not only will we learn the basics of how Hadoop works and why it’s such an important technology, we will also get examples of how we should probably be using it.

Data, Technology, and the Future of Play

Data, Technology, and the Future of Play is a book that helps us understand the smart toy and big data landscape. Toys are becoming increasingly smarter. Once merely objects of play, today’s toys often act as agents of play, guiding kids toward learning through interactivity and feedback. As this report explains, smart toys not only employ sophisticated algorithms, but also share data and get updates via the cloud. What are the implications of a toy that, instead of fostering open-ended play, now becomes the playmate? The book covers: the three feedback loops that guide the behavior of a smart toy over its lifetime; privacy concerns about a smart toy’s ability to "converse" with children by collecting and storing conversations; the risk of children becoming socially withdrawn and addicted to technology due to increased use of smart toys; the benefits of smart toys, including the ability of the machines to learn from users and provide customized education; and predictions for how data and technology will change the nature of play and toys, including connected play and immersive environments. This book is written by Meghan Athavale, an entrepreneur, artist, visual designer, and musician. She spent her childhood in northern Canada running through forests, fishing, swimming, and climbing trees.

Big Data Now

Big Data Now is a series of books, and this is the fifth edition in a row. This annual Big Data Now report recaps the trends, tools, applications, and forecasts we’ve talked about over the past year. It brings insights on data-driven cultures, data science, data pipelines, big data architecture and infrastructure, the Internet of Things and real time, applications of big data, and security, ethics, and governance. The book is written by O'Reilly Media to give a current perspective on big data.

Help spread the word!…
