Tuesday, June 11, 2013

Download all PDF links from an ASCE search page


- Search for articles by keyword
- Get a results page that lists all the PDF links
- Inspect a PDF link element; it looks like this:
    <a class="ref nowrap" target="_blank" title="Opens new window" href="/doi/pdf/10.1061/%28ASCE%290733-9488%282001%29127%3A3%28118%29">PDF (1569 KB)</a>
- Open the browser console and use jQuery to select the PDF links by class, text, or other features:
    var pdfLinks = [];
    $('.ref:contains("PDF")').each( function() { pdfLinks.push( $(this).attr( 'href' ) ); } );
    pdfLinks.join( '\n' )
- Copy the output text from the browser console and save it to a text file, list.txt.
- Use wget to download every link, prefixing each relative path with the site root:
    cat list.txt | xargs -I{} wget http://ascelibrary.org{}
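
The console step above relies on the page having jQuery loaded. As a rough sketch of the same idea with plain DOM calls, the snippet below filters the links by their text and resolves each relative href to an absolute URL. The names `collectPdfUrls` and `baseUrl` are made up for illustration and are not from the original steps; the `a.ref` selector is taken from the sample element shown earlier and may need adjusting if the page markup differs.

```javascript
// Hypothetical helper (not part of the original post): given a list of
// anchor-like objects, return absolute URLs for the ones labelled "PDF".
function collectPdfUrls(links, baseUrl) {
  return Array.from(links)
    // Keep only links whose visible text mentions "PDF",
    // roughly what jQuery's :contains("PDF") does.
    .filter(function (a) { return a.textContent.indexOf('PDF') !== -1; })
    // Resolve the relative href (e.g. /doi/pdf/...) against the base URL.
    .map(function (a) { return new URL(a.getAttribute('href'), baseUrl).href; });
}

// In a browser console this prints one absolute URL per line.
// The guard only keeps the snippet from failing outside a browser.
if (typeof document !== 'undefined') {
  console.log(
    collectPdfUrls(document.querySelectorAll('a.ref'), document.baseURI).join('\n')
  );
}
```

Because the saved lines are then full URLs, list.txt could be passed to wget directly with `wget -i list.txt`, without prefixing the host through xargs.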
