Calgarypuck Forums - The Unofficial Calgary Flames Fan Community

Old 01-15-2011, 02:53 PM   #1
Textcritic
Acerbic Cyberbully
Join Date: Aug 2003
Location: back in Chilliwack
File Compression Help

I need to email a fairly small text file (four A4 pages, 2300 words, three fonts) as a PDF, but when I convert it, it ends up as a 5–7 MB behemoth. I have done hundreds of conversions of text files before, some of which included complex images and tables, and I have never encountered this problem. When I converted my 300+ page dissertation into a PDF, the result was a very reasonable 1.3 MB file. I have attempted to compress the PDF using my word processor, Adobe Acrobat, and StuffIt Deluxe, but the smallest I can get it is 5.5 MB. What is going on?

Any ideas why, or any thoughts about how to reduce this to a manageable size? I REALLY hate emailing such large items, so any help on this would be most appreciated.
__________________
Dealing with Everything from Dead Sea Scrolls to Red C Trolls

Quote:
Originally Posted by woob
"...harem warfare? like all your wives dressup and go paintballing?"
"The Lying Pen of Scribes" Ancient Manuscript Forgeries Project
Old 01-15-2011, 03:58 PM   #2
Hack&Lube
Atomic Nerd
 
Join Date: Jul 2004
Location: Calgary

Do you have the full version of Acrobat? Try this: print out the text, scan it back in with OCR, and make a new PDF. It should be pretty small. I have no idea why it would be so big. How are you converting it? The issue is not with compression, it's with your conversion. You are doing something wrong there. Are you sure you aren't somehow saving it as an image PDF instead of a text-based one?

If all else fails and you want to compress your PDF, try WinRAR.
Old 01-15-2011, 04:06 PM   #3
Mad Mel
First Line Centre
Join Date: Mar 2009
Location: Brisbane, Australia

I'm wondering if it is turning it into an image in the PDF, rather than text. In the PDF, can you select text? Or is it like a picture? If it is like a picture, there may be a setting in your exporter which is forcing it to an image rather than PostScript. Or perhaps a font you're using isn't supported by PDF, so it's rasterizing the output. As a test, you could try changing the whole thing to Arial or Times New Roman, and see if it magically fixes it.
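
For anyone who would rather check this without opening the PDF and trying to select text, here is a minimal sketch in Python. It is only a heuristic and rests on some assumptions: it scans the raw bytes of the file for image and font markers, so objects tucked inside compressed object streams (possible in PDF 1.5 and later) will be missed. The function names and the file path passed on the command line are made up for illustration.

```python
import re
import sys


def pdf_content_hints(data: bytes) -> dict:
    """Count raw markers for image XObjects and font objects in PDF bytes.

    Heuristic only: anything stored inside a compressed object stream
    is invisible to a plain byte scan like this one.
    """
    images = len(re.findall(rb"/Subtype\s*/Image\b", data))
    # \b keeps "/Type /FontDescriptor" from being counted as a font object.
    fonts = len(re.findall(rb"/Type\s*/Font\b", data))
    return {"images": images, "fonts": fonts, "size_bytes": len(data)}


def looks_rasterized(hints: dict) -> bool:
    """Guess that pages were saved as pictures: images present, no fonts."""
    return hints["images"] > 0 and hints["fonts"] == 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python pdf_hints.py some_document.pdf
    with open(sys.argv[1], "rb") as handle:
        hints = pdf_content_hints(handle.read())
    print(hints, "looks rasterized:", looks_rasterized(hints))
```

If the scan reports images and no fonts at all, that would be consistent with the image theory; a file with fonts and no images points somewhere else.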
Old 01-15-2011, 04:09 PM   #4
photon
The new goggles also do nothing.
Join Date: Oct 2001
Location: Calgary

Is the original text file actually text? Or is it an image format of some kind?

If you create a PDF of actual text, it only has to store the letters and some extra info, but if the source is an image, it can only store the entire image.

So, as H&L suggests, if the original is an image, you can run OCR on it to convert it into actual text.
Old 01-15-2011, 04:35 PM   #5
Textcritic

Thanks for your suggestions.

I am using a MacBook Pro with a 2.40 GHz Intel Core i5, and am running Office for Mac 2008 and Acrobat 9 Pro (v. 9.4.1).

I have attempted the PDF conversion in the Word print dialogue box, using my home printer-driver as well as the Xerox printer at work, and as far as I can tell, both are set to grayscale text document settings. I have attempted the same conversion in Acrobat Pro directly, while selecting "smallest file size" in the "Adobe PDF Settings" dialogue box.

I have used the same fonts in literally dozens of other conversions, so I am fairly certain that that is not the issue. Is there a way to change the settings in the exporter to ensure that this is a text-based PDF? I suspect that H&L and Mad Mel are correct about the conversion settings.
Old 01-15-2011, 04:36 PM   #6
photon

Is the source document editable?
Old 01-15-2011, 05:01 PM   #7
Textcritic

Quote:
Originally Posted by photon View Post
Is the source document editable?
It is a standard Word document (.doc). I have always tried avoiding .docx files simply because I have a number of colleagues in Europe who do not have systems that can read these. It is editable, and I am in fact working on it as we speak. The text file itself is only 78 kB.
Old 01-15-2011, 05:17 PM   #8
Textcritic

Quote:
Originally Posted by Mad Mel View Post
I...As a test, you could try changing the whole thing to Arial or Times New Roman, and see if it magically fixes it.
I attempted this, changing the whole thing to Calibri, and it only made things worse. My 5-page, single-font PDF is a whopping 10.5 MB!!
Old 01-15-2011, 05:19 PM   #9
Mad Mel

I have Office 2008 on my laptop (PC), so I just did a quick test. I copied the content of this thread, pasted it into Word, saved it (2008 format, docx), then saved to PDF (Save As PDF or XPS menu item, not printing to an Adobe driver). The Word doc was 112 kB, the PDF was 399 kB. It did contain some images (posters' avatars), but the text did save to PDF as text. Is that ratio (about 4x the size) the same as you're getting?
Old 01-15-2011, 05:22 PM   #10
Textcritic

Quote:
Originally Posted by Mad Mel View Post
I have Office 2008 on my laptop (PC), so I just did a quick test. I copied the content of this thread, pasted it into Word, saved it (2008 format, docx), then saved to PDF (Save As PDF or XPS menu item, not printing to an Adobe driver). The Word doc was 112 kB, the PDF was 399 kB. It did contain some images (posters' avatars), but the text did save to PDF as text. Is that ratio (about 4x the size) the same as you're getting?
That sounds about average for what I normally produce using the print dialogue box to create PDFs. For the document that I am experiencing problems with, the increase is MASSIVE. The doc itself is 78 kB (.doc), and the PDFs generated are consistently 5.5–7 MB.
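
To put numbers on the difference, here is a quick sketch of the arithmetic in Python, using the file sizes quoted in this thread. It assumes 1 MB = 1000 kB, and the function name is just for illustration.

```python
def growth_ratio(source_kb: float, pdf_kb: float) -> float:
    """Return how many times larger the PDF is than its source document."""
    return pdf_kb / source_kb


# Mad Mel's test: 112 kB Word doc, 399 kB PDF.
normal = growth_ratio(112, 399)

# The problem document: 78 kB .doc, 5.5 to 7 MB PDF (assuming 1 MB = 1000 kB).
low = growth_ratio(78, 5.5 * 1000)
high = growth_ratio(78, 7 * 1000)

print(f"typical: {normal:.1f}x, problem file: {low:.0f}x to {high:.0f}x")
# prints "typical: 3.6x, problem file: 71x to 90x"
```

So the test conversion grew by a factor of roughly 3.6, while the problem file grows by a factor of roughly 70 to 90.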
Old 01-15-2011, 05:23 PM   #11
Mad Mel

I just looked at the "Save As PDF" options... there's a checkbox for "Bitmap text when fonts may not be embedded". If that is turned on, it could cause bloating.
Old 01-15-2011, 05:45 PM   #12
sclitheroe
#1 Goaltender
 
Join Date: Sep 2005

To be quite honest, I don't think 5 to 7 megs counts as large anymore. Any email system out there is going to handle attachments that size without complaint. I wouldn't waste much time on it at this point.
__________________
-Scott
Old 01-15-2011, 09:18 PM   #13
corporatejay
Franchise Player
Join Date: Jul 2005

Quote:
Originally Posted by sclitheroe View Post
To be quite honest, I don't think 5 to 7 megs counts as large anymore. Any email system out there is going to handle attachments that size without complaint. I wouldn't waste much time on it at this point.
While I agree that wouldn't be considered "large", you'd be shocked at how many big company networks kick back files that are larger than 5-10 MB.
Old 01-15-2011, 11:14 PM   #14
Barnes
Franchise Player
Join Date: Aug 2005
Location: Violating Copyrights

Yeah, it sounds like the fonts are being embedded.
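
One rough way to test that guess is to count how many embedded font programs the PDF references. This is a sketch in Python under the same caveat as any raw byte scan: references hidden inside compressed object streams will not be counted, and the function name is hypothetical. It looks for the `/FontFile`, `/FontFile2`, and `/FontFile3` keys that font descriptors use to point at embedded font data.

```python
import re


def count_embedded_font_programs(data: bytes) -> int:
    """Count /FontFile, /FontFile2 and /FontFile3 references in raw PDF bytes.

    Heuristic only: each reference suggests one embedded font program,
    but entries inside compressed object streams are not visible here.
    """
    return len(re.findall(rb"/FontFile[23]?\b", data))
```

A count of zero on a multi-megabyte file would suggest that embedded fonts are not what is taking up the space.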
Old 01-16-2011, 09:23 AM   #15
sclitheroe

Quote:
Originally Posted by corporatejay View Post
While I agree that wouldn't be considered "large", you'd be shocked at how many big company networks kick back files that are larger than 5-10 MB.
I support a lot of companies - I haven't seen one refuse a 5 meg attachment in a very long time. Exchange 2007 and 2010 even ship with a 10 meg limit out of the box as their default.
Old 01-18-2011, 03:16 PM   #16
Hack&Lube

I still think it's either saving it as an image (e.g., you mention greyscale text mode) or it is embedding extra information that is unnecessary. It may be because you are doing it from the print menu instead of the Save As menu. When I used to do this at work, it would default to imaging mode when dealing with the network printer instead of text mode.

If you are comfortable with it, perhaps use one of the online free PDF conversion services and see what they return. I'm guessing they will be more concerned with bandwidth and will automatically default to the best text format and compression setting.

http://www.google.ca/search?hl=en&so...-s1g1&aql=&oq=

Last edited by Hack&Lube; 01-18-2011 at 03:30 PM.
Old 01-18-2011, 03:24 PM   #17
BlackEleven
Redundant Minister of Redundancy
Join Date: Apr 2004
Location: Montreal

Quote:
Originally Posted by Textcritic View Post
Thanks for your suggestions.

I am using a MacBook Pro with a 2.40 GHz Intel Core i5, and am running Office for Mac 2008 and Acrobat 9 Pro (v. 9.4.1).

I have attempted the PDF conversion in the Word print dialogue box, using my home printer-driver as well as the Xerox printer at work, and as far as I can tell, both are set to grayscale text document settings. I have attempted the same conversion in Acrobat Pro directly, while selecting "smallest file size" in the "Adobe PDF Settings" dialogue box.

I have used the same fonts in literally dozens of other conversions, so I am fairly certain that that is not the issue. Is there a way to change the settings in the exporter to ensure that this is a text-based PDF? I suspect that H&L and Mad Mel are correct about the conversion settings.
Instead of doing it from the print dialog, go to "Save As..." and select PDF from the Format drop-down menu. See if that works any better.
Old 01-18-2011, 03:39 PM   #18
BlackEleven

Quote:
Originally Posted by sclitheroe View Post
To be quite honest, I don't think 5 to 7 megs counts as large anymore. Any email system out there is going to handle attachments that size without complaint. I wouldn't waste much time on it at this point.
I think it depends on who you're sending it to. If someone sent me a 5MB+ pdf full of nothing but text, I'd assume that person was computer illiterate -- akin to those types of people that like to email a .doc or .ppt containing nothing but an image(s) when they could have just attached the image(s) to the mail in the first place (for example).

I'd put it on par with receiving an email full of grammatical and spelling mistakes. Yeah, I can understand it, but it's still annoying.
Old 01-18-2011, 07:50 PM   #19
sclitheroe

Quote:
Originally Posted by BlackEleven View Post
I think it depends on who you're sending it to. If someone sent me a 5MB+ pdf full of nothing but text, I'd assume that person was computer illiterate
Why? It's not like they hand-coded the individual bytes in the PDF file.
Old 01-18-2011, 08:29 PM   #20
corporatejay

Quote:
Originally Posted by sclitheroe View Post
Why? It's not like they hand-coded the individual bytes in the PDF file.

Speak for yourself.