01-15-2011, 02:53 PM
|
#1
|
Acerbic Cyberbully
Join Date: Aug 2003
Location: back in Chilliwack
|
File Compression Help
I need to email a fairly small text file (four A4 pages, 2300 words, three fonts) as a PDF, but when I convert it, it ends up as a 5–7 MB behemoth. I have done hundreds of conversions of text files before, some of which included complex images and tables, and I have never encountered this problem. When I converted my 300+ page dissertation into a PDF, the result was a very reasonable 1.3 MB file. I have attempted to compress it using my word processor, Adobe Acrobat, and StuffIt Deluxe, but the smallest I can get the thing is 5.5 MB. What is going on?
Any ideas why, or any thoughts about how to reduce this to a manageable size? I REALLY hate emailing such large items, so any help on this would be most appreciated.
|
|
|
01-15-2011, 03:58 PM
|
#2
|
Atomic Nerd
Join Date: Jul 2004
Location: Calgary
|
Do you have the full version of Acrobat? Try this: print out the text, scan it back in with OCR, and make a new PDF. It should be pretty small. I have no idea why it would be so big. How are you converting it? The issue is not with compression; it's with your conversion. You are doing something wrong there. Are you sure you aren't somehow saving it as an image PDF instead of a text-based one?
If all else fails and you want to compress your PDF, try WinRAR.
|
|
|
01-15-2011, 04:06 PM
|
#3
|
First Line Centre
Join Date: Mar 2009
Location: Brisbane, Australia
|
I'm wondering if it is turning it into an image in the PDF, rather than text. In the PDF, can you select text? Or is it like a picture? If so, there may be a setting in your exporter which is forcing it to an image rather than postscript. Or perhaps a font you're using isn't supported by PDF, so it's rasterizing the output. As a test, you could try changing the whole thing to Arial or Times New Roman, and see if it magically fixes it.
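One rough way to run the "text or picture?" check without a viewer is to scan the PDF's raw bytes for image and font dictionary markers. The sketch below is a heuristic only: the function name `summarize_pdf_markers` and the sample bytes are made up for illustration, and PDFs that pack their objects into compressed object streams can hide these markers from a raw scan.

```python
import re

# Heuristic sketch: count image and font markers in raw PDF bytes.
# PDF dictionaries mark embedded images with "/Subtype /Image" and
# font objects with "/Type /Font"; the whitespace after the key is optional.
IMAGE_RE = re.compile(rb"/Subtype\s*/Image\b")
FONT_RE = re.compile(rb"/Type\s*/Font\b")


def summarize_pdf_markers(data: bytes) -> dict:
    """Return counts of image and font markers found in the raw bytes."""
    return {
        "images": len(IMAGE_RE.findall(data)),
        "fonts": len(FONT_RE.findall(data)),
    }


# Synthetic sample (not a real PDF): one image object and one font object.
sample = (
    b"<< /Type /XObject /Subtype /Image /Width 2481 >> "
    b"<< /Type /Font /Subtype /TrueType >>"
)
print(summarize_pdf_markers(sample))
```

To try it on a real file, read the file in binary mode and pass the bytes in, for example `summarize_pdf_markers(open("document.pdf", "rb").read())`, where `document.pdf` is a placeholder name. Several images and no fonts would point toward a rasterized document.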
|
|
|
01-15-2011, 04:09 PM
|
#4
|
The new goggles also do nothing.
Join Date: Oct 2001
Location: Calgary
|
Is the original text file actually text? Or is it an image format of some kind?
If you create a PDF from actual text, it only has to store the letters and some extra info, but if the source is an image, it has to store the entire image.
So, as H&L suggests, if the original is an image, you can run OCR on it to convert it into actual text.
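A back-of-the-envelope calculation shows how large that gap can be. The figures below are assumptions for illustration, not measurements from this document: a 300 dpi raster, 8-bit grayscale (one byte per pixel), no compression, and about six bytes per word including the trailing space.

```python
# Rough comparison: bytes needed for raw text vs. uncompressed page images.
# Assumptions (not from the thread): 300 dpi, 1 byte per pixel, 6 bytes/word.
A4_INCHES = (8.27, 11.69)  # A4 is 210 mm x 297 mm
DPI = 300
BYTES_PER_WORD = 6


def raster_page_bytes(dpi: int = DPI) -> int:
    """Uncompressed size of one grayscale A4 page at the given resolution."""
    width_px = round(A4_INCHES[0] * dpi)
    height_px = round(A4_INCHES[1] * dpi)
    return width_px * height_px


def text_bytes(words: int) -> int:
    """Approximate size of the plain text alone."""
    return words * BYTES_PER_WORD


pages, words = 4, 2300  # page and word counts quoted in the first post
image_total = pages * raster_page_bytes()
text_total = text_bytes(words)
print(f"text:   {text_total / 1024:.1f} KB")
print(f"images: {image_total / (1024 * 1024):.1f} MB before compression")
```

Under these assumptions each page is roughly 8.7 million pixels, so even well-compressed page images can plausibly land in the multi-megabyte range while the text itself is only a few kilobytes.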
|
|
|
01-15-2011, 04:35 PM
|
#5
|
Acerbic Cyberbully
Join Date: Aug 2003
Location: back in Chilliwack
|
Thanks for your suggestions.
I am using a MacBook Pro (2.40 GHz Intel Core i5), running Office for Mac 2008 and Acrobat 9 Pro (v. 9.4.1).
I have attempted the PDF conversion in the Word print dialogue box, using my home printer-driver as well as the Xerox printer at work, and as far as I can tell, both are set to grayscale text document settings. I have attempted the same conversion in Acrobat Pro directly, while selecting "smallest file size" in the "Adobe PDF Settings" dialogue box.
I have used the same fonts in literally dozens of other conversions, so I am fairly certain that that is not the issue. Is there a way to change the settings in the exporter to ensure that this is a text-based PDF? I suspect that H & L and MadMel are correct about the conversion settings.
|
|
|
01-15-2011, 04:36 PM
|
#6
|
The new goggles also do nothing.
Join Date: Oct 2001
Location: Calgary
|
Is the source document editable?
|
|
|
01-15-2011, 05:01 PM
|
#7
|
Acerbic Cyberbully
Join Date: Aug 2003
Location: back in Chilliwack
|
Quote:
Originally Posted by photon
Is the source document editable?
|
It is a standard Word document (.doc). I have always tried avoiding .docx files simply because I have a number of colleagues in Europe who do not have systems that can read these. It is editable, and I am in fact working on it as we speak. The text file itself is only 78 kB.
|
|
|
01-15-2011, 05:17 PM
|
#8
|
Acerbic Cyberbully
Join Date: Aug 2003
Location: back in Chilliwack
|
Quote:
Originally Posted by Mad Mel
I...As a test, you could try changing the whole thing to Arial or Times New Roman, and see if it magically fixes it.
|
I attempted this, turning the whole thing into Calibri, and it made things even worse. My 5-page, single-font PDF is a whopping 10.5 MB!!
|
|
|
01-15-2011, 05:19 PM
|
#9
|
First Line Centre
Join Date: Mar 2009
Location: Brisbane, Australia
|
I have Office 2008 on my laptop (PC), so I just did a quick test. I copied the content of this thread, pasted it into Word, saved it (2008 format, docx), then saved to PDF (Save As PDF or XPS menu item, not printing to an Adobe driver). The Word doc was 112 KB, the PDF was 399 KB. It did contain some images (posters' avatars), but the text did save to PDF as text. Is that ratio (about 4x the size) the same as you're getting?
|
|
|
01-15-2011, 05:22 PM
|
#10
|
Acerbic Cyberbully
Join Date: Aug 2003
Location: back in Chilliwack
|
Quote:
Originally Posted by Mad Mel
I have Office 2008 on my laptop (PC), so I just did a quick test. I copied the content of this thread, pasted it into Word, saved it (2008 format, docx), then saved to PDF (Save As PDF or XPS menu item, not printing to an Adobe driver). The Word doc was 112 KB, the PDF was 399 KB. It did contain some images (posters' avatars), but the text did save to PDF as text. Is that ratio (about 4x the size) the same as you're getting?
|
That sounds about average for what I normally produce using the print dialogue box to create PDFs. For the document that I am experiencing problems with, the increase is MASSIVE. The doc itself is 78 kB (.doc), and the PDFs generated are consistently 5.5–7 MB.
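Putting the two sets of numbers side by side makes the difference concrete. This is a small sketch using only the sizes quoted in the thread; the helper name `growth_ratio` is made up, and 1 MB is assumed to mean 1024 kB.

```python
# Growth ratios for the sizes quoted in the thread, all expressed in kB.
# Assumption: 1 MB is treated as 1024 kB.
KB_PER_MB = 1024


def growth_ratio(source_kb: float, pdf_kb: float) -> float:
    """How many times larger the PDF is than its source document."""
    return pdf_kb / source_kb


typical = growth_ratio(112, 399)                  # the test conversion
problem_low = growth_ratio(78, 5.5 * KB_PER_MB)   # problem document, low end
problem_high = growth_ratio(78, 7 * KB_PER_MB)    # problem document, high end

print(f"typical conversion: {typical:.1f}x")
print(f"problem document:   {problem_low:.0f}x to {problem_high:.0f}x")
```

The test conversion grows by a factor of about 3.6, while the problem document grows by roughly 70 to 90 times, which is the kind of jump you would expect if the pages were being stored as pictures instead of text.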
|
|
|
01-15-2011, 05:23 PM
|
#11
|
First Line Centre
Join Date: Mar 2009
Location: Brisbane, Australia
|
I just looked at the "Save As PDF" options... there's a checkbox for "Bitmap text when fonts may not be embedded". If that is turned on, it could cause bloating.
|
|
|
01-15-2011, 05:45 PM
|
#12
|
#1 Goaltender
|
To be quite honest, I don't think 5 to 7 megs counts as large anymore. Any email system out there is going to handle attachments that size without complaint. I wouldn't waste much time on it at this point.
__________________
-Scott
|
|
|
01-15-2011, 09:18 PM
|
#13
|
Franchise Player
|
Quote:
Originally Posted by sclitheroe
To be quite honest, I don't think 5 to 7 megs counts as large anymore. Any email system out there is going to handle attachments that size without complaint. I wouldn't waste much time on it at this point.
|
While I agree that wouldn't be considered "large", you'd be shocked at how many big company networks kick back files that are larger than 5-10 MB.
|
|
|
01-15-2011, 11:14 PM
|
#14
|
Franchise Player
Join Date: Aug 2005
Location: Violating Copyrights
|
Yeah, it sounds like the fonts are being embedded.
|
|
|
01-16-2011, 09:23 AM
|
#15
|
#1 Goaltender
|
Quote:
Originally Posted by corporatejay
While I agree that wouldn't be considered "large", you'd be shocked at how many big company networks kick back files that are larger than 5-10 MB.
|
I support a lot of companies - I haven't seen one refuse a 5 meg attachment in a very long time. Exchange 2007 and 2010 even ship with a 10 meg limit out of the box as their default.
__________________
-Scott
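One detail worth keeping in mind when comparing a file size against a mail limit: attachments are typically base64-encoded in transit, which turns every 3 bytes into 4 characters. The sketch below only does that arithmetic; whether a particular server measures its limit against the raw or the encoded size is an assumption here, and the helper name `encoded_size` is made up.

```python
import base64

MB = 1024 * 1024


def encoded_size(raw_bytes: int) -> int:
    """Base64 output length for raw_bytes of input, ignoring line breaks."""
    return 4 * ((raw_bytes + 2) // 3)


# Sanity check of the formula against the standard library on a small input.
assert encoded_size(1000) == len(base64.b64encode(bytes(1000)))

for size_mb in (5, 7):
    encoded_mb = encoded_size(size_mb * MB) / MB
    print(f"{size_mb} MB attachment -> about {encoded_mb:.1f} MB encoded")
```

So a PDF at the top of the 5–7 MB range grows by about a third once encoded, which puts it much closer to a 10 MB ceiling than the file size alone suggests.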
|
|
|
01-18-2011, 03:16 PM
|
#16
|
Atomic Nerd
Join Date: Jul 2004
Location: Calgary
|
I still think it's either saving it as an image (i.e., it mentions greyscale text mode) or embedding extra information that is unnecessary. It may be because you are doing it from the print menu instead of the Save As menu. When I used to do this at work, it would default to imaging mode instead of text mode when dealing with the network printer.
If you are comfortable with it, perhaps use one of the online free PDF conversion services and see what they return. I'm guessing they will be more concerned with bandwidth and will automatically default to the best text format and compression setting.
http://www.google.ca/search?hl=en&so...-s1g1&aql=&oq=
Last edited by Hack&Lube; 01-18-2011 at 03:30 PM.
|
|
|
01-18-2011, 03:24 PM
|
#17
|
Redundant Minister of Redundancy
Join Date: Apr 2004
Location: Montreal
|
Quote:
Originally Posted by Textcritic
Thanks for your suggestions.
I am using a MacBook Pro (2.40 GHz Intel Core i5), running Office for Mac 2008 and Acrobat 9 Pro (v. 9.4.1).
I have attempted the PDF conversion in the Word print dialogue box, using my home printer-driver as well as the Xerox printer at work, and as far as I can tell, both are set to grayscale text document settings. I have attempted the same conversion in Acrobat Pro directly, while selecting "smallest file size" in the "Adobe PDF Settings" dialogue box.
I have used the same fonts in literally dozens of other conversions, so I am fairly certain that that is not the issue. Is there a way to change the settings in the exporter to ensure that this is a text-based PDF? I suspect that H & L and MadMel are correct about the conversion settings.
|
Instead of doing it from the print dialog, go to "Save As.." and select PDF from the Format drop-down menu. See if that works any better.
|
|
|
01-18-2011, 03:39 PM
|
#18
|
Redundant Minister of Redundancy
Join Date: Apr 2004
Location: Montreal
|
Quote:
Originally Posted by sclitheroe
To be quite honest, I don't think 5 to 7 megs counts as large anymore. Any email system out there is going to handle attachments that size without complaint. I wouldn't waste much time on it at this point.
|
I think it depends on who you're sending it to. If someone sent me a 5MB+ pdf full of nothing but text, I'd assume that person was computer illiterate -- akin to those types of people that like to email a .doc or .ppt containing nothing but an image(s) when they could have just attached the image(s) to the mail in the first place (for example).
I'd put it on par with receiving an email full of grammatical and spelling mistakes. Yeah, I can understand it, but it's still annoying.
|
|
|
01-18-2011, 07:50 PM
|
#19
|
#1 Goaltender
|
Quote:
Originally Posted by BlackEleven
I think it depends on who you're sending it to. If someone sent me a 5MB+ pdf full of nothing but text, I'd assume that person was computer illiterate
|
Why? It's not like they hand-coded the individual bytes in the PDF file.
__________________
-Scott
|
|
|
01-18-2011, 08:29 PM
|
#20
|
Franchise Player
|
Quote:
Originally Posted by sclitheroe
Why? It's not like they hand-coded the individual bytes in the PDF file.
|
Speak for yourself.
|
|
|
All times are GMT -6.
|
|