Geek Gazette

Welcome to this week's Geek Gazette! Get ready to geek out with the latest tech news, tips, and trends. From groundbreaking innovations to three truths and a lie, we've got it all covered. Let's embark on this tech adventure together!


Tesla stock sinks again as Elon Musk says the EV maker will lay off 10% of its workforce

Tesla is reducing its workforce by "more than 10%" globally after a disappointing first fiscal quarter where it significantly missed Wall Street's sales forecasts, according to a memo from CEO Elon Musk.

Rumors of layoffs at Tesla had circulated for months, particularly as the company reduced production in China and halted some stock rewards. Reporting from outlets such as Electrek and Business Insider, along with employee discussions on Blind, suggested that up to 20% of Tesla's workforce could be affected by the planned cuts.

In the leaked memo, Musk told employees that more than 10% of Tesla's workforce, roughly 14,000 of its 140,000 employees, will be laid off to eliminate the redundancies and overlapping job functions that accumulated as the company expanded.


Meta’s X competitor Threads invites developers to sign up for API access, publishes docs

Threads, Meta's rival to Twitter/X, has progressed in its developer API rollout. Following a phase of testing with select companies in March, Threads is now providing developer documentation and a sign-up sheet for interested parties. This move precedes the API's anticipated public launch in June.

The documentation outlines the API's current endpoints and limitations, helping developers build Threads-connected apps and integrate with the new social network.

Threads' developer API documentation outlines post analytics tracking, including views, likes, replies, reposts, and quotes. It also covers post and media publishing, reply retrieval, and troubleshooting. Account limitations include 250 API-published posts and 1,000 replies per day to prevent spam. Media specifications and a 500-character limit for text posts are provided.
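To make those limits concrete, here is a rough Python sketch of client-side checks an integrator might add before publishing. The endpoint URL, token, and JSON field names are placeholders of our own invention, not taken from Meta's documentation; only the 500-character and 250-posts-per-day figures come from the docs described above.

  import requests

  # Placeholder values -- not the real Threads API surface.
  API_BASE = "https://example.invalid/threads-api"  # hypothetical endpoint
  ACCESS_TOKEN = "YOUR_TOKEN"

  MAX_POST_CHARS = 500      # text limit reported in the docs
  MAX_POSTS_PER_DAY = 250   # daily API-publishing cap reported in the docs

  posts_published_today = 0

  def publish_text_post(text: str) -> dict:
      """Run client-side checks, then call a (hypothetical) publish endpoint."""
      global posts_published_today
      if len(text) > MAX_POST_CHARS:
          raise ValueError(f"Post exceeds {MAX_POST_CHARS} characters")
      if posts_published_today >= MAX_POSTS_PER_DAY:
          raise RuntimeError("Daily API publishing quota reached")
      resp = requests.post(
          f"{API_BASE}/posts",
          headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
          json={"text": text},
          timeout=10,
      )
      resp.raise_for_status()
      posts_published_today += 1
      return resp.json()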


The First AI-Generated Romcom Is Coming Out This Summer

TCL, a Chinese TV maker, has revealed "Next Stop Paris," seemingly the first AI-generated romantic comedy. The execution disappoints, however, as the chaotic 60-second trailer on YouTube makes clear.

The footage features distorted faces and jumbled scenery, presenting a generic meet-cute between two protagonists whose appearance constantly changes.

Kotaku highlighted that the AI struggled to accurately depict a clock tower, its Roman numerals coming out garbled. The trailer's narration, pondering life's pace and love's timing, only adds to the confusion around the already unclear plot. The chaos has prompted speculation that TCL missed April Fools' Day, and raises questions about who would actually want such a film and whether it is intended as a joke.


Cerebral to pay $7 million settlement in Facebook pixel data leak case

The U.S. Federal Trade Commission has reached a settlement with telehealth company Cerebral, requiring it to pay $7 million over allegations that it mishandled sensitive health data. Cerebral offers online therapy and medication management for mental health conditions such as anxiety, depression, ADHD, bipolar disorder, and substance abuse.

In March 2023, Cerebral notified 3.2 million individuals who had interacted with its platforms about a data breach caused by tracking pixels. The FTC complaint accuses Cerebral and its former CEO, Kyle Robertson, of sharing consumers' personal health information with third parties for advertising purposes and violating cancellation policies.

The complaint alleges that Cerebral provided sensitive data to platforms like LinkedIn, Snapchat, and TikTok through tracking tools on its website or apps.


UK privacy watchdog to examine practice of web scraping to get training data for AI

The UK's Information Commissioner's Office (ICO) is examining the legality of web scraping for gathering data to train generative AI models. This scrutiny was announced as part of the first consultation in a series concerning generative AI models, which generate text or images based on prompts after being trained on extensive datasets of similar media.

The process of gathering training data for AI models presents privacy challenges, primarily because it often involves automated collection at a large scale, raising concerns about the inadvertent collection of personal data. Research has revealed methods to extract training data from large language models (LLMs), which may inadvertently expose personal information.

The National Cyber Security Centre has cautioned about the threat of prompt injection attacks, which could potentially compromise the security of all AI tools by granting access to protected LLM data.

The ICO's consultations on web scraping primarily address data protection standards, rather than concerns regarding intellectual property or contract law infringement. According to the ICO, current practices suggest that five out of six lawful bases for processing data under British laws are unlikely to be applicable for training generative AI models with web-scraped data.

Under the UK GDPR, the only remaining lawful basis for training generative AI with web-scraped data is legitimate interests. However, this requires entities to assess the balance between individuals' data rights and the necessity of web scraping for AI training.

The ICO is seeking input from various stakeholders, including developers, users, legal advisors, civil society groups, and public bodies, to inform their positions on generative AI.


Discord Users Are Being Tracked Through Data-Scraping Site

Unlike messaging platforms such as Instagram or Snapchat, Discord has a long history of permitting bots and third-party tools to proliferate on its platform. Among these, Spy.Pet stands out as a surveillance tool: it tracks Discord users across the servers they've joined and offers detailed message logs for a starting price of approximately $5 worth of cryptocurrency.

404 Media conducted a trial of Spy.Pet's service and verified the authenticity of messages retrieved from Discord servers. Spy.Pet employs bots to gather data from servers and claims to be monitoring over 627 million Discord user accounts and more than 14,200 Discord servers, with a database comprising over 4 billion messages. Notably, Discord's official website states that it hosts 19 million active servers.

Beyond the sheer scale of Spy.Pet's operations, platforms like it raise significant concerns about user privacy and safety. Such activity also directly violates Discord's Terms of Service, which explicitly prohibit scraping Discord's data without the company's consent. Regrettably, tools like Spy.Pet can facilitate spying on Discord users, enabling stalkers, bullies, or other malicious actors to harm individuals within the community.

In response to inquiries from PCMag, a Discord spokesperson emphasized the company's commitment to safeguarding user privacy and data, saying Discord is actively investigating the matter and will take appropriate action if violations of its Terms of Service and Community Guidelines are confirmed. Because the investigation is ongoing, the company declined to comment further.


Cisco Warns of Global Surge in Brute-Force Attacks Targeting VPN and SSH Services

Cisco has issued a warning regarding a worldwide increase in brute-force attacks targeting a range of devices, such as VPN services, web application authentication interfaces, and SSH services. These attacks, observed since at least March 18, 2024, are believed to originate from TOR exit nodes and other anonymizing tunnels and proxies, according to Cisco Talos.

The cybersecurity company warns that successful brute-force attacks could lead to unauthorized network access, account lockouts, or denial-of-service conditions.

These broad and opportunistic attacks target various devices, including Cisco Secure Firewall VPN, Checkpoint VPN, Fortinet VPN, SonicWall VPN, RD Web Services, Mikrotik, Draytek, and Ubiquiti. Cisco Talos notes that the attacks use both generic and valid usernames for specific organizations, indiscriminately targeting a wide range of sectors across different regions.


Alleged cryptojacker accused of stealing $3.5M from cloud to mine under $1M in crypto

Nebraska resident Charles O. Parks III faces charges for allegedly defrauding cloud service providers of over $3.5 million in a cryptojacking scheme. The indictment alleges that between January and August 2021, Parks illicitly earned over $970,000 by utilizing the computational resources of two unnamed cloud computing companies, referred to as "Company 1" (together with its "Subsidiary 1") and "Company 2".

The cloud service providers allegedly defrauded by Charles O. Parks III, said to be based in Seattle and Redmond, remain unnamed. Parks is accused of creating five accounts with one of the providers' subsidiaries, utilizing a VPN and various identities including names, email addresses, and corporate affiliations.

Operating through entities such as CP3O LLC and the ironically named MultiMillionaire LLC, Parks allegedly persuaded the cloud service providers to grant him ever greater access to powerful computing resources for cryptomining, while failing to pay the substantial bills he ran up at each company.


Web scraping with Power Query: How does it work?

What is Web Scraping?

Web scraping involves extracting data from web pages, allowing you to gather useful information and store it in your database. It's commonly used by companies for competitive analysis, comparing data from rival firms for comprehensive market research.

However, it's also employed to gather official information from government sites, retrieve statistics, or collect data from websites in related sectors. Regardless of the goal, web scraping can be done from any website, with tools like Power Query in Power BI making the process easier.

How do I use Power Query for Web Scraping?

To perform web scraping using Power Query and Power BI, follow these steps:

Extracting data:

  • Open Power BI and click on "Get data" in the Home tab.
  • Search for "web" and select the web connector.
  • Enter the URL from which you want to extract data and click "OK".
  • Authenticate if necessary.
  • Click "Connect" and navigate to the desired table.
  • Click "Transform data".

Web scraping simplifies data extraction from web pages, but the extracted data often needs reshaping before it fits your reporting model. That is what the transformation step in Power Query is for.

Transforming data:

After clicking "Transform data," the Power Query editor opens for web scraping completion. It displays the selected table with its data. To tailor it to your analysis and reporting requirements, you can make various modifications using Power Query. These include renaming the table, deleting columns, renaming columns, and adding columns. Once the table is prepared, click "Close and apply" to view it in Power BI.

Add all the tables on the site

Web scraping with Power Query offers the advantage of extracting data from multiple pages on a website. To achieve this, you need to adjust the table code by following these steps:

  1. Use the Advanced Editor to access the code behind the table.
  2. Locate the line Source = Web.BrowserContents("https://www.your-url.com/blog/page/2"). As written, this line only pulls in the tables from that single page.
  3. Modify the code so the page number becomes a parameter, for example Source = Web.BrowserContents("https://www.your-url.com/blog/page/" & Number.ToText(Page)), where Page is a numeric parameter.
  4. The query then becomes a function, which you can rename to FxPages. Next, create a list containing all the page numbers you want to scrape.
  5. Use "Invoke Custom Function" to call the FxPages function for each page number in the list.
  6. A new column is added; expand it to create a single table containing data from all pages.
  7. Transform the data as previously described.
  8. Finally, click "Close and apply."

Mastering Power Query with DataScientest

Power Query provides various database management features in addition to web scraping. However, mastering the query editor can be challenging. Training is crucial for leveraging these services effectively.


Web Scraping Python – Step by Step Guide

Have you ever felt on the verge of uncovering groundbreaking insights, only to be hindered by the vast expanse of data scattered across the web? Welcome to the world of web scraping with Python—a tool revered by data enthusiasts and professionals alike. This guide goes beyond surface-level explanations; it delves into the intricacies, providing a step-by-step tutorial on extracting web data with precision and efficiency.

Whether you're a novice eager to dive in or an expert seeking to sharpen your skills, you've come to the right place. Let's unlock the mysteries of web scraping with Python, turning complexity into simplicity.

What Is Web Scraping Python?

At its core, web scraping entails gathering data from the internet using programming methods. This encompasses various tasks, such as retrieving product prices, aggregating articles, or compiling contact databases. Python is highly favored for these tasks due to its user-friendly interface and robust library support. However, success in web scraping relies not only on data extraction but also on executing the process efficiently while adhering to the implicit rules of the web.
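As a minimal illustration of that idea, the short sketch below uses the requests and BeautifulSoup libraries to pull items from a page. The URL and CSS selector are placeholder assumptions, not a specific real site.

  import requests
  from bs4 import BeautifulSoup

  URL = "https://example.com/products"  # placeholder: the page you want to scrape

  # Fetch the page; a User-Agent header and a timeout are basic good manners.
  response = requests.get(URL, headers={"User-Agent": "gazette-demo/1.0"}, timeout=10)
  response.raise_for_status()

  # Parse the HTML and print every element matching a (hypothetical) CSS class.
  soup = BeautifulSoup(response.text, "html.parser")
  for item in soup.select(".product-name"):
      print(item.get_text(strip=True))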

Building a Web Scraper: Python Prepwork

When embarking on web scraping with Python, the initial step involves preparing your environment, ensuring Python is installed on your computer—preferably Python 3.x for its latest features and enhancements. This setup establishes the groundwork for tackling the tasks and challenges encountered in web scraping. Following setup, it's crucial to acquaint yourself with key Python libraries essential for web scraping.

Tools like BeautifulSoup and Scrapy are integral to a scraper's toolkit. BeautifulSoup, known for its user-friendly approach, is ideal for beginners, while Scrapy is tailored for advanced scraping tasks, offering a robust framework for scraping projects. This phase entails selecting the appropriate tools and understanding their capabilities, which significantly influence the efficiency and success of your web scraping endeavors.

Getting to the Libraries

Delving deeper into web scraping with Python, the choice of libraries becomes pivotal, serving as indispensable guides through the intricacies of HTML and JavaScript found in modern websites. In this domain, two prominent players emerge: BeautifulSoup and Scrapy. BeautifulSoup, renowned for its user-friendly interface, is ideal for beginners, simplifying HTML document parsing and enabling easy data extraction from straightforward websites.

Meanwhile, Scrapy offers a robust framework tailored for large-scale scraping tasks, excelling in handling link navigation and managing requests seamlessly for complex projects. For websites heavily reliant on JavaScript, tools like Selenium prove invaluable, simulating user interactions to access data not directly available through HTML parsing.
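To make that contrast concrete, here is a small Scrapy spider sketch that extracts post titles and follows pagination links. The start URL and CSS selectors are placeholders, assumed for illustration only.

  import scrapy

  class BlogSpider(scrapy.Spider):
      """Minimal spider: yields post titles and follows 'next page' links."""
      name = "blog"
      start_urls = ["https://example.com/blog"]  # placeholder start page

      def parse(self, response):
          # One item per post title found on the current page (hypothetical selector).
          for title in response.css("h2.post-title::text").getall():
              yield {"title": title.strip()}

          # Follow the pagination link, if there is one, and parse that page too.
          next_page = response.css("a.next::attr(href)").get()
          if next_page:
              yield response.follow(next_page, callback=self.parse)

Saved as blog_spider.py, this could be run with scrapy runspider blog_spider.py -o posts.json, letting Scrapy handle request scheduling and retries for you.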

WebDrivers and Browsers

When starting a web scraping project in Python, understanding the role of WebDrivers and browsers is crucial for interacting with web pages effectively. WebDrivers act as bridges between your code and dynamic web content, allowing automated control over browsers to perform tasks like navigation and form filling.

Selenium WebDriver is a powerful tool in this realm, facilitating automated interactions with web applications across various browsers. By configuring the browser driver and specifying the desired browser, such as Chrome or Firefox, your Python scripts gain enhanced capabilities for data extraction and complex scraping operations beyond basic HTML analysis.
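As a rough sketch of that setup, the snippet below drives Chrome through Selenium 4; the URL and selector are placeholders, and Selenium Manager is assumed to resolve a matching ChromeDriver automatically.

  from selenium import webdriver
  from selenium.webdriver.common.by import By

  driver = webdriver.Chrome()  # Selenium 4 can fetch a matching driver on its own
  try:
      driver.get("https://example.com/dynamic-listing")  # placeholder URL
      driver.implicitly_wait(10)  # allow JavaScript-rendered content to appear

      # Grab elements that only exist after the page's scripts have run.
      for card in driver.find_elements(By.CSS_SELECTOR, ".listing-card"):
          print(card.text)
  finally:
      driver.quit()  # always close the browser session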

Read the full article here and find everything about scraping with Python!


Deceptive Revelations: Three Truths and a Lie

  1. Apple says it was ordered to pull WhatsApp from China App Store
  2. Cisco creates architecture to improve security and sell you new switches
  3. Malicious Google Ads pushing fake IP scanner software with hidden backdoor
  4. Fashion giant Shein has been slapped with yet another hand: ‘It’s somewhat shocking that they’ve been able to get away with it’


Answer: Well, you see, Shein wasn't just slapped with any hand; they were slapped with a lawsuit! Turns out, the hand that did the slapping was too busy shopping for discounted clothes on Shein's website. It couldn't resist those irresistible deals! So, instead of a high-five, Shein got a legal notice. Moral of the story: Always watch where you slap, or you might end up in court!


Stay tuned for more innovations, insights, and exciting updates. The future holds limitless possibilities, and we're thrilled to explore them with you. Until next time, stay curious, stay tech-savvy, and we'll catch you in the next edition!

Want to gather data without breaking a sweat? Jump on board with our proxy solutions and let's make data collection a breeze!

No boring stuff here – just tech with a side of swagger!
