Calgarypuck Forums - The Unofficial Calgary Flames Fan Community

05-15-2009, 10:37 AM   #1
MelBridgeman
Franchise Player
Join Date: Mar 2007
Location: Calgary
Regular Expression Experts?

Hey, is anyone in tune with regular expressions?

I am trying to replace the content attribute of HTML meta tags.

The issue is that meta tags aren't closed by </meta>. The expression I have created does match, but it also replaces everything past the meta tag.

I am using PHP and the eregi_replace function. For the title tags I am using
<title[^>]*>.*</title[^>]*> to replace the contents of the tags, and it works great. Does anyone have an idea how to parse out tags that don't have a closing tag?

I am not a regular expression guru, just trying to go off examples on the net.
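
The expression that over-replaces isn't shown, so the following is only a sketch of one common cause: a greedy `.*` that runs to the last `>` on the line. It uses Python's `re` module rather than PHP's eregi_replace, and the sample HTML and variable names are made up for illustration.

```python
import re

# Hypothetical one-line page; not taken from the original post.
html = '<meta name="keywords" content="a, b"><title>Home</title><p>Body</p>'
new_tag = '<meta name="keywords" content="new">'

# Greedy: ".*" runs to the end of the line, then backs up to the LAST ">".
greedy = re.sub(r'<meta.*>', new_tag, html)

# Negated class: "[^>]*" cannot cross a ">", so the match stops at the FIRST ">".
bounded = re.sub(r'<meta[^>]*>', new_tag, html)

print(greedy)
print(bounded)
```

The greedy version swallows the title and body along with the meta tag, which looks a lot like the symptom described; the bounded version leaves them alone.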
__________________
Quote:
Originally Posted by Katie Telford The chief of staff to the prime minister of Canada
“Line up all kinds of people to write op-eds.”
05-15-2009, 08:37 PM   #3
nfotiu
Franchise Player
Join Date: May 2002
Location: Virginia

What have you tried?

<meta[^>]*> should work, no?
05-16-2009, 12:48 AM   #4
MelBridgeman
Franchise Player
Join Date: Mar 2007
Location: Calgary

It does not work. I think it is because some of my meta tags are structured like this:

<meta name="keywords" content="<!--#include virtual="/includes/filename.html"-->">

Basically, I want to completely replace this entire tag with something else.

Sometimes they are structured like this:

<meta name="keywords" content="<!--#include virtual="/includes/filename.html"-->"/>

The combination I tried, similar to what you posted, actually changed the tag correctly, but it replaced everything after it too...
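
One way to handle tags like these is to treat the embedded `<!-- ... -->` include as a single unit, so that the `>` inside `-->` does not end the match early. The sketch below uses Python's `re` module rather than eregi_replace; `replace_meta_tags` is a hypothetical helper name, and the `<title>` line is added only to show that the following content survives. The same pattern should carry over to PHP's preg_replace, though that is untested here.

```python
import re

# Inside the tag, accept either a whole <!-- ... --> comment or any single
# character that is not ">". The comment alternative is tried first.
META_TAG = re.compile(r'<meta\b(?:<!--.*?-->|[^>])*>', re.IGNORECASE | re.DOTALL)

def replace_meta_tags(html, replacement):
    """Replace every <meta ...> tag in html with replacement (hypothetical helper)."""
    # A function is used as the replacement so backslashes are never interpreted.
    return META_TAG.sub(lambda m: replacement, html)

sample = (
    '<meta name="keywords" content="<!--#include virtual="/includes/filename.html"-->">\n'
    '<meta name="keywords" content="<!--#include virtual="/includes/filename.html"-->"/>\n'
    '<title>Page</title>\n'
)

# For comparison: the plain negated class stops at the ">" of "-->".
simple = re.sub(r'<meta[^>]*>', 'X', sample)

print(simple)
print(replace_meta_tags(sample, 'X'))
```

With the plain `<meta[^>]*>` the trailing `">` or `"/>` is left behind; the comment-aware pattern takes the whole tag in both layouts. Note that it matches every meta tag, not just the keywords one.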
05-16-2009, 10:33 AM   #5
c.t.ner
First Line Centre
Join Date: Aug 2004
Location: Calgary in Heart, Ottawa in Body

Sorry, I'm not familiar with regular expressions. Do you mean Expression Engine? (Just curious)
05-16-2009, 10:42 AM   #6
missdpuck
Franchise Player
Join Date: Jul 2008
Location: in a swamp, tied to a cypress tree

My regular expressions are "Don't call me babe" and the Bronx Cheer. Sorry.
__________________
http://arc4raptors.org
05-16-2009, 11:45 AM   #7
MelBridgeman
Franchise Player
Join Date: Mar 2007
Location: Calgary

Quote:
Originally Posted by c.t.ner
Sorry, I'm not familiar with regular expressions. Do you mean Expression Engine? (Just curious)
http://en.wikipedia.org/wiki/Regular_expression
05-16-2009, 11:52 AM   #8
photon
The new goggles also do nothing.
Join Date: Oct 2001
Location: Calgary

Sorry, wish I knew something about them. My interactions with them usually entail a furious Google search until I find what I need, which I promptly forget.
__________________
Uncertainty is an uncomfortable position.
But certainty is an absurd one.
05-16-2009, 01:26 PM   #9
MelBridgeman
Franchise Player
Join Date: Mar 2007
Location: Calgary

Ya, my problem is that I am expecting Google to find me the answer, instead of taking what I find and applying it... which is what I should be doing.
05-16-2009, 09:03 PM   #10
llama64
First Line Centre
Join Date: Nov 2006
Location: /dev/null

Quote:
Originally Posted by MelBridgeman
Ya, my problem is that I am expecting Google to find me the answer, instead of taking what I find and applying it... which is what I should be doing.
Isn't that what all programmers do?

Regular Expressions make my brain hurt. If you do find a solution though, could you post it? I'd be curious to see why it isn't working.
05-16-2009, 09:12 PM   #11
maverickstruth
Backup Goalie
Join Date: Mar 2006
Location: Calgary

Try something like... this?

Code:
<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)</\1>
(Got it from http://www.regular-expressions.info/examples.html, but basically all it's doing is finding a tag, then matching everything between that tag and its closing tag using the \1 backreference.)
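
A small sketch of how that backreference pattern behaves, written with Python's `re` module rather than PHP; the sample HTML is made up, and the case-insensitive flag is added so lowercase tags match. As far as I know, lazy quantifiers like `.*?` and the `\b` assertion are Perl-style features that the POSIX-style eregi_replace does not support, so in PHP this would probably need preg_replace.

```python
import re

# Group 1 captures the tag name; "\1" requires the same name in the closing tag.
PAIRED_TAG = re.compile(r'<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)</\1>',
                        re.IGNORECASE | re.DOTALL)

# Made-up sample, not from the thread.
html = '<title>Old title</title><p class="x">Text</p>'

# Collect (tag name, inner content) for each paired tag found.
pairs = [(m.group(1), m.group(2)) for m in PAIRED_TAG.finditer(html)]
print(pairs)
```

This only helps for tags that have a closing tag, which is why it doesn't cover the meta case.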

Gah, just reread. You've got that part working already.

What about... (warning, this is really messy and probably won't work)

Code:
<([A-Z][A-Z0-9]*)(\b[^>]*)?(<!--(\b[^>]*)?>)?(\b[^>]*)?>
< to capture the beginning of the tag
([A-Z][A-Z0-9]*) to capture the tag itself (note, you may need to include lower case here)
(\b[^>]*)? to capture anything that does not include a >
(<!--(\b[^>]*)?>)? to capture zero or more words after <!-- but before >, which may never occur
(\b[^>]*)? to capture zero or more words after the html comment
> to capture the end of the tag

Last edited by maverickstruth; 05-16-2009 at 09:40 PM. Reason: correcting myself, hopefully
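
A quick check of the longer pattern above against the sample tag from post #4, using Python's `re` module (an assumption; the thread itself uses PHP). It is only a sketch to show where the match ends, not a fix.

```python
import re

PATTERN = re.compile(
    r'<([A-Z][A-Z0-9]*)(\b[^>]*)?(<!--(\b[^>]*)?>)?(\b[^>]*)?>',
    re.IGNORECASE,
)

tag = '<meta name="keywords" content="<!--#include virtual="/includes/filename.html"-->">'

m = PATTERN.match(tag)
# The first "[^>]*" already consumes the "<!--" text, so the optional
# comment group never gets a chance to match and the pattern stops
# at the ">" that closes "-->".
leftover = tag[m.end():]
print(repr(leftover))
```

The leftover `">` suggests the pattern ends up doing the same thing as the plain `<meta[^>]*>` on this input.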
05-17-2009, 09:12 AM   #12
sclitheroe
#1 Goaltender
Join Date: Sep 2005

Would a tool like this help? http://gskinner.com/RegExr/

Personally I would write a small AWK script to parse the text. AWK is a fantastic tool to have in your arsenal for when you need to do text manipulation. It's less obtuse than Perl, but still really powerful for working with text.
__________________
-Scott
05-17-2009, 06:24 PM   #13
MelBridgeman
Franchise Player
Join Date: Mar 2007
Location: Calgary

Quote:
Originally Posted by sclitheroe
Would a tool like this help? http://gskinner.com/RegExr/

Personally I would write a small AWK script to parse the text. AWK is a fantastic tool to have in your arsenal for when you need to do text manipulation. It's less obtuse than Perl, but still really powerful for working with text.
Nice link, man.
Will try out some suggestions.
05-18-2009, 12:03 AM   #14
Shazam
Franchise Player
Join Date: Aug 2005
Location: Memento Mori

Try

<meta\b[^>]*>(.*?)[^/]>

Should match all meta tags that aren't closed XHTML style, but not ones that are.
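
Trying that pattern in Python's `re` module (an assumption; PHP's engine may differ in details) suggests it behaves a bit differently from the description. The `plain` sample and the `<title>` suffix are made up for illustration.

```python
import re

PATTERN = re.compile(r'<meta\b[^>]*>(.*?)[^/]>', re.IGNORECASE)

# Tag layout from post #4, followed by a made-up title element.
ssi = ('<meta name="keywords" content="<!--#include virtual="/includes/filename.html"-->">'
       '<title>T</title>')
# An ordinary meta tag with no embedded comment (hypothetical).
plain = '<meta name="keywords" content="a, b"><title>T</title>'

ssi_match = PATTERN.search(ssi).group(0)
plain_match = PATTERN.search(plain).group(0)

print(ssi_match)
print(plain_match)
```

On the include-style tag it happens to stop in the right place, because the first `>` belongs to `-->` and the trailing `[^/]>` then picks up `">`. On an ordinary meta tag the trailing `[^/]>` still has to match something, so the match runs on into the next tag.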
05-18-2009, 09:12 AM   #15
nfotiu
Franchise Player
Join Date: May 2002
Location: Virginia

If the meta tag is always one line, and the only thing on the line, this would work:

<meta[^$]*>$

Or maybe alter a little to account for leading or trailing white space.
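
One caveat with `[^$]`: inside square brackets `$` is a literal dollar sign, not an end-of-line anchor, so `[^$]*` also matches newlines. The sketch below shows that effect in Python's `re` module (an assumption; the thread uses PHP) and a line-bound variant that allows leading and trailing whitespace. The sample page and the name `LINE_META` are made up.

```python
import re

# Made-up page with the meta tag alone on its own line.
page = (
    '<head>\n'
    '  <meta name="keywords" content="<!--#include virtual="/includes/filename.html"-->">\n'
    '<title>T</title>\n'
    '</head>\n'
)

# As posted: "[^$]*" crosses line breaks, so the match runs to the last ">"
# that sits at the end of the text.
as_posted = re.sub(r'<meta[^$]*>$', 'NEW', page)

# Line-bound variant: "." stops at newlines, and re.MULTILINE lets "^" and "$"
# match at the start and end of each line.
LINE_META = re.compile(r'^[ \t]*<meta\b.*>[ \t]*$', re.IGNORECASE | re.MULTILINE)
per_line = LINE_META.sub('NEW', page)

print(as_posted)
print(per_line)
```

The line-bound variant relies on the stated assumption that the tag is the only thing on its line.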

Last edited by nfotiu; 05-18-2009 at 09:27 AM.
05-18-2009, 10:58 AM   #16
Shazam
Franchise Player
Join Date: Aug 2005
Location: Memento Mori

Quote:
Originally Posted by nfotiu
If the meta tag is always one line, and the only thing on the line, this would work:

<meta[^$]*>$

Or maybe alter a little to account for leading or trailing white space.
This works better than mine. Change to

<meta[^$]*[^/]>$

If you don't want to catch valid meta tags.
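
A sketch of the same idea (skip tags that end in `/>`) using a negative lookbehind in Python's `re` module. This is an alternative to the `[^/]>` form, not the pattern as posted; lookbehind is a Perl-style feature, so it would not work with eregi_replace. The name `OPEN_META_LINE` is made up, and each tag is assumed to sit alone on its line.

```python
import re

# "(?<!/)" is a negative lookbehind: the closing ">" must not be preceded by "/".
OPEN_META_LINE = re.compile(r'^[ \t]*<meta\b.*(?<!/)>[ \t]*$',
                            re.IGNORECASE | re.MULTILINE)

# Both tag layouts from post #4, one per line.
page = (
    '<meta name="keywords" content="<!--#include virtual="/includes/filename.html"-->">\n'
    '<meta name="keywords" content="<!--#include virtual="/includes/filename.html"-->"/>\n'
)

result = OPEN_META_LINE.sub('NEW', page)
print(result)
```

Only the first line is replaced; the `/>` version is left untouched.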