How to save/download pdf embedded in web page without a pdf filename
The common way to stream a PDF to the browser in ColdFusion (CF) is:
<cfheader name="Content-Disposition" value="attachment;filename=#PDFFileName#">
<cfcontent type="application/pdf" reset="true" variable="#toBinary(PDFinMemory)#">
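For readers outside ColdFusion, the same two response headers can be sketched as a small Python WSGI application. This is only an illustration under stated assumptions: `make_pdf_app`, `pdf_bytes`, and `filename` are hypothetical names that do not appear in the original answer, and the filename is assumed to be plain ASCII without quotes.

```python
def make_pdf_app(pdf_bytes, filename):
    """Build a WSGI app that serves pdf_bytes as a download named filename.

    Hypothetical helper: it mirrors the cfheader/cfcontent pair above.
    """
    def app(environ, start_response):
        headers = [
            # Same role as <cfcontent type="application/pdf">.
            ("Content-Type", "application/pdf"),
            # Same role as the Content-Disposition <cfheader>: "attachment"
            # asks the browser to save the file under the given name.
            ("Content-Disposition", 'attachment; filename="{}"'.format(filename)),
            ("Content-Length", str(len(pdf_bytes))),
        ]
        start_response("200 OK", headers)
        return [pdf_bytes]

    return app
```

To try it locally, the app could be passed to `wsgiref.simple_server.make_server("", 8000, app)` from the standard library.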
Use a C# WebRequest to request the URL of the PDF. Then check the response headers for a Content-Type of 'application/pdf'. If it matches, save the binary stream to a PDF file on disk.
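The answer names C#'s WebRequest. The same check is sketched below in Python with the standard library's `urllib.request` instead, so this is a swapped-in technique rather than the answerer's code. The helper names `is_pdf_content_type` and `save_pdf` are assumptions made for illustration.

```python
import shutil
import urllib.request


def is_pdf_content_type(content_type):
    """Return True if a Content-Type header value denotes a PDF."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=binary" and ignore case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/pdf"


def save_pdf(url, dest_path):
    """Fetch url and write the body to dest_path only if it is served as a PDF.

    Returns True when a file was written, False otherwise.
    """
    with urllib.request.urlopen(url) as response:
        if not is_pdf_content_type(response.headers.get("Content-Type")):
            return False
        with open(dest_path, "wb") as out:
            # Stream the binary body to disk without loading it all in memory.
            shutil.copyfileobj(response, out)
    return True
```

Checking the header rather than the URL is what makes this work when the address has no .pdf filename in it.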
(HTML) Download a PDF file instead of opening them in browser when clicked
There is now the HTML5 download attribute that can handle this.
I agree, and think Sarim's answer is good (it probably should be the chosen answer if the OP ever returns). However, this answer is still the reliable way to handle it (as Yiğit Yener's answer points out and--oddly--people agree with). While the download attribute has gained support, it's still spotty:
http://caniuse.com/#feat=download
Python Download PDF Embedded in a Page
Using Selenium with a specific Chrome profile, you can download embedded PDFs using the following code:
Code:
def download_pdf(lnk):
    from selenium import webdriver
    from time import sleep

    options = webdriver.ChromeOptions()
    download_folder = "C:\\"
    # Chrome preferences: save PDFs to disk instead of showing them in the viewer.
    profile = {"plugins.plugins_list": [{"enabled": False,
                                         "name": "Chrome PDF Viewer"}],
               "download.default_directory": download_folder,
               "download.extensions_to_open": "",
               "plugins.always_open_pdf_externally": True}
    options.add_experimental_option("prefs", profile)

    print("Downloading file from link: {}".format(lnk))
    # Recent Selenium versions take `options=`; `chrome_options=` is deprecated.
    driver = webdriver.Chrome(options=options)
    driver.get(lnk)
    # Give the download time to finish; 5 seconds is an arbitrary guess.
    sleep(5)

    # Assumes the URL has the shape http://host/dir/name.cfm?...
    filename = lnk.split("/")[4].split(".cfm")[0]
    print("File: {}".format(filename))
    print("Status: Download Complete.")
    print("Folder: {}".format(download_folder))
    driver.quit()
And when I call this function:
download_pdf("http://www.equibase.com/premium/eqbPDFChartPlus.cfm?RACE=1&BorP=P&TID=ALB&CTRY=USA&DT=06/17/2002&DAY=D&STYLE=EQB")
This is the output:
>>> Downloading file from link: http://www.equibase.com/premium/eqbPDFChartPlus.cfm?RACE=1&BorP=P&TID=ALB&CTRY=USA&DT=06/17/2002&DAY=D&STYLE=EQB
>>> File: eqbPDFChartPlus
>>> Status: Download Complete.
>>> Folder: C:\
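The filename in that output comes from splitting the link on "/" and ".cfm", which only works for URLs shaped like this one. A more general approach, sketched here with the standard library's `urllib.parse`, looks only at the path part of the URL. `filename_from_url` is a hypothetical helper, not part of the original answer.

```python
import posixpath
from urllib.parse import urlsplit


def filename_from_url(url):
    """Return the last path segment of url without its extension."""
    # urlsplit separates the path from the query string, so slashes in
    # query values such as DT=06/17/2002 do not affect the result.
    path = urlsplit(url).path
    base = posixpath.basename(path)
    return posixpath.splitext(base)[0]
```

For the link above this yields the same name, eqbPDFChartPlus, without depending on how many directories the URL contains or on the .cfm extension.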
Take a look at the specific profile:
profile = {"plugins.plugins_list": [{"enabled": False,
                                     "name": "Chrome PDF Viewer"}],
           "download.default_directory": download_folder,
           "download.extensions_to_open": ""}
It disables the Chrome PDF Viewer plugin (which embeds the PDF in the web page), sets the default download folder to the value of the download_folder variable, and tells Chrome not to open any downloaded file types automatically. The plugins.always_open_pdf_externally setting used in download_pdf additionally tells Chrome to download PDFs rather than display them.
After that, when you open the so-called "internal link", your webdriver will automatically download the .pdf file to the download_folder.
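Note that download_pdf reports completion without checking the download folder. One way to confirm the file has arrived is to poll the folder until a PDF is present and no partial download remains. The sketch below relies on Chrome writing in-progress downloads with a .crdownload suffix, and it assumes the folder holds no older PDFs. `wait_for_download` and its parameters are hypothetical names.

```python
import os
import time


def wait_for_download(folder, timeout=30.0, poll_interval=0.5):
    """Return the path of the newest finished PDF in folder, or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(folder)
        # Chrome keeps a *.crdownload file while a download is in progress.
        in_progress = [n for n in names if n.endswith(".crdownload")]
        finished = [os.path.join(folder, n) for n in names
                    if n.lower().endswith(".pdf")]
        if finished and not in_progress:
            return max(finished, key=os.path.getmtime)
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

It could be called after driver.get(lnk) and before the driver is closed, with the same folder that was passed as download.default_directory.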
Set the default save as name for an <embed> or <iframe> that uses a Blob
Note:
This answer is outdated.
The behavior described below has changed since this was posted, and it may change again in the future.
Since this question has been asked elsewhere, with better responses, I invite you to read these instead: Can I set the filename of a PDF object displayed in Chrome?
I haven't found a way to do this yet for Chrome's default PDF plugin.
I've got something that works for Firefox though, and which will default to download.pdf in Chrome, for some odd reason...
By passing a dataURI in the form of
'data:application/pdf;headers=filename%3D' + FILE_NAME + ';base64,...'
Firefox accepts FILE_NAME as the name of your file, but chrome doesn't...
A plnkr to show a better download.pdf in Chrome, which doesn't like nested iframes...
And a snippet which will only work in Firefox:
const FILE_NAME = 'myCoolFileName.pdf';
const file_header = ';headers=filename%3D';
fetch('https://dl.dropboxusercontent.com/s/rtktu1zwurgd43q/simplePDF.pdf?dl=0')
  .then(r => r.blob())
  .then(blob => {
    const f = new FileReader();
    f.onload = () => myPdfViewer.src = f.result.replace(';', file_header + encodeURIComponent(FILE_NAME) + ';');
    f.readAsDataURL(blob);
  });
<iframe id="myPdfViewer" width="500" height="500"></iframe>