Thread: Stripping html from c code

  1. #1
    Senior Member
    Join Date
    Jul 2001
    Posts
    280

    Stripping html from c code

    I need to strip HTML tags from C/C++ code. I can locate the tags via regex, maybe something like:
    Code:
    <.*>
    The problem is that the match also includes the text in between the tags. For example, it will return:
    Code:
    <a href=http://bla blah blah>Text I Don't Want</a>
    Also, if the code has a conditional statement like this:
    Code:
    if ( i < 32 || i > 59)
    that will also return as a tag:
    Code:
    < 32 || i >
    Any ideas how to do this? Thanks.
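
The first problem comes from `.*` being greedy: it runs to the last `>` on the line. Here is a minimal sketch in C++ with std::regex that compares it against a bounded pattern, `[^<>]*`, which cannot cross another angle bracket. The function names strip_greedy and strip_bounded are made up for illustration and are not from the thread.

```cpp
#include <regex>
#include <string>

// Greedy: ".*" extends to the LAST '>' on the line, so the match
// swallows everything between the first tag and the final one.
std::string strip_greedy(const std::string& line) {
    static const std::regex greedy("<.*>");
    return std::regex_replace(line, greedy, "");
}

// Bounded: "[^<>]*" stops at the next angle bracket, so every match
// covers exactly one "<...>" pair.
std::string strip_bounded(const std::string& line) {
    static const std::regex bounded("<[^<>]*>");
    return std::regex_replace(line, bounded, "");
}
```

On the anchor example above, strip_greedy leaves an empty string, while strip_bounded leaves Text I Don't Want. The bounded pattern still treats `< 32 || i >` as a tag, so on its own it only addresses the first half of the question.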

  2. #2

    Re: Stripping html from c code

    Use regex like this:

    Code:
    <a.*a>
    and

    Code:
    <[a-zA-Z].*[a-zA-Z]>

  3. #3
    Guest

    Re: Stripping html from c code

    Please post a sample program that has these tags in it. I'll write you a small script.

  4. #4
    Senior Member
    Join Date
    Jul 2001
    Posts
    280

    Re: Stripping html from c code

    Ok, here is some example code that has the HTML tags:
    Code:
    <html>
    <head>
    <title>codetest.html</title>
    </head>
    <body><pre>
    <span style='color=#000000'></span><span style='color:#008000'>#include <qdatetime.h>
    #include <qmainwindow.h>
    #include <qstatusbar.h>
    #include <qmessagebox.h>
    #include <qmenubar.h>
    #include <qapplication.h>
    #include <qpainter.h>
    #include <qprinter.h>
    #include <qlabel.h>
    #include <qimage.h>
    #include <qprogressdialog.h>
    #include "canvas.h"
    
    #include <stdlib.h>
    
    </span><span style='color:#009d0a'><b>// We use a global variable to save memory - all the brushes and pens in
    // the mesh are shared.
    </b></span><span style='color:#800000'>static</span><span style='color:#000000'> QBrush *tb = </span><span style='color:#0000ff'>0</span><span style='color:#000000'>;
    </span><span style='color:#800000'>static</span><span style='color:#000000'> QPen *tp = </span><span style='color:#0000ff'>0</span><span style='color:#000000'>;
    
    <b>class</b> EdgeItem;
    <b>class</b> NodeItem;
    
    <b>class</b> EdgeItem: <b>public</b> QCanvasLine
    {
    <b>public</b>:
      EdgeItem( NodeItem*, NodeItem*, QCanvas *canvas );
      </span><span style='color:#800000'>void</span><span style='color:#000000'> setFromPoint( </span><span style='color:#800000'>int</span><span style='color:#000000'> x, </span><span style='color:#800000'>int</span><span style='color:#000000'> y ) ;
      </span><span style='color:#800000'>void</span><span style='color:#000000'> setToPoint( </span><span style='color:#800000'>int</span><span style='color:#000000'> x, </span><span style='color:#800000'>int</span><span style='color:#000000'> y );
      </span><span style='color:#800000'>static</span><span style='color:#000000'> </span><span style='color:#800000'>int</span><span style='color:#000000'> count() { <b>return</b> c; }
      </span><span style='color:#800000'>void</span><span style='color:#000000'> moveBy(</span><span style='color:#800000'>double</span><span style='color:#000000'> dx, </span><span style='color:#800000'>double</span><span style='color:#000000'> dy);
    <b>private</b>:
      </span><span style='color:#800000'>static</span><span style='color:#000000'> </span><span style='color:#800000'>int</span><span style='color:#000000'> c;
    };
    
    </span><span style='color:#800000'>static</span><span style='color:#000000'> </span><span style='color:#800000'>const</span><span style='color:#000000'> </span><span style='color:#800000'>int</span><span style='color:#000000'> imageRTTI = </span><span style='color:#0000ff'>984376</span><span style='color:#000000'>;
    
    
    <b>class</b> ImageItem: <b>public</b> QCanvasRectangle
    {
    <b>public</b>:
      ImageItem( QImage img, QCanvas *canvas );
      </span><span style='color:#800000'>int</span><span style='color:#000000'> rtti () </span><span style='color:#800000'>const</span><span style='color:#000000'> { <b>return</b> imageRTTI; }
      </span><span style='color:#800000'>bool</span><span style='color:#000000'> hit( </span><span style='color:#800000'>const</span><span style='color:#000000'> QPoint&) </span><span style='color:#800000'>const</span><span style='color:#000000'>;
    <b>protected</b>:
      </span><span style='color:#800000'>void</span><span style='color:#000000'> drawShape( QPainter & );
    <b>private</b>:
      QImage image;
      QPixmap pixmap;
    };

  5. #5

    Re: Stripping html from c code

    Hmmm, why not just use w3m or links to output the formatted page, then copy and paste it?

  6. #6
    Guest

    Re: Stripping html from c code

    Take the code, put it into a .html file, then use lynx:

    % lynx -dump program.html > program.c

  7. #7
    Senior Member
    Join Date
    Jul 2001
    Posts
    280

    Re: Stripping html from c code

    I don't receive the code as one nice big HTML page that I can load into a browser. I get it one line at a time, and I need to strip the HTML tags from each line so I can reformat it as needed, because the line could have changed since the last time I received it for formatting.

  8. #8
    Guest

    Re: Stripping html from c code

    Where do you receive HTML'ed code from anyway?

  9. #9

    Re: Stripping html from c code

    "I get it one line at a time and I need to strip the html tags"
    If you're talking about a pipe, do something like this:

    Code:
    command_that_pipes_your_html_files | lynx -dump <(cat) > program.c
    (this will work, but I think lynx might be able to take input from stdin, too).

  10. #10
    Senior Member
    Join Date
    Jul 2001
    Posts
    280

    Re: Stripping html from c code

    I apologize for being unclear. The C code (actually, it's not C, but a scripting language with very similar syntax) is entered by the user into a text box in a program I'm working on. The text is formatted automatically by the text box into rich text for syntax highlighting. The rich text is actually HTML markup that is invisible to the user. When I read a line of text, it gives me the HTML code as well, but I need to get rid of those tags so I can see just the code. I have the ability to search the string for words or regex and remove the results.
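
Since the markup comes from one text box, one possible approach is to match only the tag names it actually emits, instead of anything between angle brackets. This is a minimal sketch in C++ with std::regex. It assumes the tag list seen in the sample earlier in the thread (span, b, pre, html, head, title, body) is complete and that the tags are always well formed. The name strip_known_tags is hypothetical.

```cpp
#include <regex>
#include <string>

// Removes only tags whose name is on a short whitelist.
// ASSUMPTION: the whitelist below was read off the sample in this thread;
// extend it if the text box emits other tags.
std::string strip_known_tags(const std::string& line) {
    static const std::regex tag(
        // "<" or "</", a known tag name, a word boundary so that "b"
        // does not match the start of "body", then attributes up to ">".
        "</?(?:span|b|pre|html|head|title|body)\\b[^<>]*>",
        std::regex::ECMAScript | std::regex::icase);
    return std::regex_replace(line, tag, "");
}
```

Applied line by line, this turns the `static QPen *tp` line from the sample into `static QPen *tp = 0;`, and it leaves both `#include <qdatetime.h>` and `if ( i < 32 || i > 59)` untouched because neither `qdatetime` nor a space is on the list. The trade-off is that script text which happens to look like a listed tag, such as a literal `<b>`, would be removed as well.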
