Sure, here's a sample script in Java using Selenium WebDriver that scrapes a webpage for a name, email, designation, and phone number (when available). The script assumes the information is stored in HTML elements with specific IDs, classes, or other attributes that can be targeted using locators such as CSS selectors.
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class WebScraper {
    public static void main(String[] args) {
        // Set the path to the ChromeDriver executable
        System.setProperty("webdriver.chrome.driver", "/path/to/chromedriver");

        // Create a new instance of the ChromeDriver
        WebDriver driver = new ChromeDriver();

        // Navigate to the webpage to be scraped
        driver.get("https://example.com");

        // Find the elements containing the name, email, designation, and phone number
        WebElement nameElement = driver.findElement(By.id("name"));
        WebElement emailElement = driver.findElement(By.id("email"));
        WebElement designationElement = driver.findElement(By.className("designation"));
        WebElement phoneElement = driver.findElement(By.cssSelector("div.contact-info span.phone"));

        // Extract the text from the elements
        String name = nameElement.getText();
        String email = emailElement.getText();
        String designation = designationElement.getText();
        String phone = phoneElement.getText();

        // Print the extracted information to the console
        System.out.println("Name: " + name);
        System.out.println("Email: " + email);
        System.out.println("Designation: " + designation);
        System.out.println("Phone: " + phone);

        // Close the browser
        driver.quit();
    }
}
This script uses the ChromeDriver executable to launch a new instance of the Chrome browser and navigate to the webpage to be scraped. It then uses various methods from the Selenium WebDriver API to find the HTML elements containing the desired information, extract the text from those elements, and print the results to the console. Finally, it closes the browser to clean up resources.
Of course, this script will need to be adapted to the structure of the specific webpage and the information you want to scrape. One practical wrinkle: findElement throws a NoSuchElementException when a locator matches nothing, so a field that is only sometimes present (such as the phone number) will abort the script on pages that lack it — see the sketch below for one way to handle that. Finally, web scraping can be legally and ethically complex, so make sure you understand the relevant laws and a site's terms of service before scraping it.
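Since findElements returns an empty list instead of throwing when nothing matches, it's the natural tool for optional fields. Below is a minimal sketch of such a helper, assuming the same placeholder selectors as above; the class and method names (ScraperUtils, textOrDefault) are purely illustrative:
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

public class ScraperUtils {
    // Returns the text of the first element matching the locator, or the
    // fallback value when the page contains no such element.
    static String textOrDefault(WebDriver driver, By locator, String fallback) {
        List<WebElement> matches = driver.findElements(locator);
        return matches.isEmpty() ? fallback : matches.get(0).getText();
    }
}
With this helper, the phone lookup becomes textOrDefault(driver, By.cssSelector("div.contact-info span.phone"), "N/A"), and a page without a phone number no longer kills the run.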
Certainly! Here's a modified version of the previous script that reads input links from an Excel sheet and writes the scraped data to an output Excel sheet, using the Apache POI library to handle the .xlsx files.
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.*;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.Iterator;

public class WebScraper {
    // The file I/O below throws checked IOExceptions, so main declares them
    public static void main(String[] args) throws Exception {
        // Set the path to the ChromeDriver executable
        System.setProperty("webdriver.chrome.driver", "/path/to/chromedriver");

        // Create a new instance of the ChromeDriver
        WebDriver driver = new ChromeDriver();

        // Load the input Excel file
        FileInputStream inputFile = new FileInputStream(new File("input.xlsx"));
        Workbook inputWorkbook = new XSSFWorkbook(inputFile);
        Sheet inputSheet = inputWorkbook.getSheetAt(0);

        // Create the output Excel workbook and sheet
        XSSFWorkbook outputWorkbook = new XSSFWorkbook();
        XSSFSheet outputSheet = outputWorkbook.createSheet("Output");

        // Loop over the rows in the input Excel sheet
        Iterator<Row> rowIterator = inputSheet.iterator();
        while (rowIterator.hasNext()) {
            Row row = rowIterator.next();

            // Read the link from the first column
            String link = row.getCell(0).getStringCellValue();

            // Navigate to the webpage to be scraped
            driver.get(link);

            // Find the elements containing the name, email, designation, and phone number
            WebElement nameElement = driver.findElement(By.id("name"));
            WebElement emailElement = driver.findElement(By.id("email"));
            WebElement designationElement = driver.findElement(By.className("designation"));
            WebElement phoneElement = driver.findElement(By.cssSelector("div.contact-info span.phone"));

            // Extract the text from the elements
            String name = nameElement.getText();
            String email = emailElement.getText();
            String designation = designationElement.getText();
            String phone = phoneElement.getText();

            // Create a row in the output sheet at the same index as the input row
            XSSFRow outputRow = outputSheet.createRow(row.getRowNum());

            // Write the extracted information to the output Excel sheet
            outputRow.createCell(0).setCellValue(name);
            outputRow.createCell(1).setCellValue(email);
            outputRow.createCell(2).setCellValue(designation);
            outputRow.createCell(3).setCellValue(phone);
        }

        // Close the input workbook and its stream
        inputWorkbook.close();
        inputFile.close();

        // Save and close the output Excel file
        FileOutputStream outputStream = new FileOutputStream("output.xlsx");
        outputWorkbook.write(outputStream);
        outputStream.close();
        outputWorkbook.close();

        // Close the browser
        driver.quit();
    }
}
This script uses the Apache POI library to read input links from an Excel sheet and write the scraped data to an output Excel sheet. It loops over the rows in the input sheet, reads the link from the first column, scrapes the desired information from each page, and writes the results to the matching row of the output sheet. Finally, it saves the output Excel file and closes the browser. One thing to watch: getCell(0).getStringCellValue() assumes every row has a non-empty first cell containing text, and will fail on blank rows or numeric cells — the sketch below shows a more defensive way to read the link.
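Specifically, getStringCellValue throws an IllegalStateException on numeric or formula cells, and getCell(0) returns null for blank cells. A more defensive read uses POI's DataFormatter, which renders any cell type as the string Excel would display. Here's a minimal sketch; the class and method names (SheetUtils, readLink) are purely illustrative:
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;

public class SheetUtils {
    private static final DataFormatter FORMATTER = new DataFormatter();

    // Reads the first cell of a row as the string Excel would display,
    // returning null for blank cells so the caller can skip the row.
    static String readLink(Row row) {
        Cell cell = row.getCell(0);
        if (cell == null) {
            return null;
        }
        String value = FORMATTER.formatCellValue(cell).trim();
        return value.isEmpty() ? null : value;
    }
}
In the scraping loop, String link = readLink(row); if (link == null) continue; skips blank rows cleanly; it's also worth wrapping each iteration in its own try/catch so one broken page doesn't abandon the rest of the batch.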