I'm sorry for asking this here: HTMLPurifier isn't purifying submitted data, but it purifies a test echo.

4    20 Aug 2015 01:50 by u/Codewow

I've been googling this for weeks now and I have not figured out why it's not working. I've already tried asking some ProgrammingHelp subs and no one has ever replied... So here I am before I go to Reddit and dreadit.

$config = HTMLPurifier_Config::createDefault();
$config->set('CSS.AllowedProperties', array());
$config->set('CSS.ForbiddenProperties', array('height,width'));
$config->set('HTML.Allowed', 'u,p,b,i,span[style],p,strong,em,li,ul,ol,div[align],br,a[href],hr');
$config->set('HTML.AllowedAttributes', 'a.href,src,alt');
$config->set('HTML.ForbiddenAttributes', array('style'));
$purifier = new HTMLPurifier($config);

Based on this config set, it should work.

I echo out this on the same page without running it through $_POST:

echo $purifier->purify('<img src="" width="20" height="20" />');

And it works just fine.

But once I submit it through $_POST to the database it doesn't do anything to it. Or well, it didn't do anything to begin with.

Why could this be?

Here's a pastebin... please pick it apart and let me know if you find anything...

http://pastebin.com/Vqcn7QbF

7 comments

0

What do you mean by "submit it through $_POST"? I don't think it supports $purifier->purify($_POST), if that's what you're doing. In any case, you'll have better luck on Stack Overflow.

0

So I ran the data through the purifier, then submitted it to the database through this:

$clean_html = mysqli_real_escape_string($conn, $_POST['Story']);
$sql="INSERT INTO published (Title, pageTitle, Story, Date)
VALUES ('$Title', '$Title', '$clean_html', '$Date')";
if (!mysqli_query($conn,$sql))

I left out the other sections of the code for space and cleanliness. I'll attempt Stack Overflow next. They usually direct me to an answer that doesn't work and close my question, so I have better luck on places like this.

0

So to be clear, your full code snippet looks like this?

$config = HTMLPurifier_Config::createDefault();
$config->set('CSS.AllowedProperties', array());
$config->set('CSS.ForbiddenProperties', array('height,width'));
$config->set('HTML.Allowed', 'u,p,b,i,span[style],p,strong,em,li,ul,ol,div[align],br,a[href],hr');
$config->set('HTML.AllowedAttributes', 'a.href,src,alt');
$config->set('HTML.ForbiddenAttributes', array('style'));
$purifier = new HTMLPurifier($config);
$_POST['Story'] = $purifier->purify($_POST['Story']);
$clean_html = mysqli_real_escape_string($conn, $_POST['Story']);
$sql="INSERT INTO published (Title, pageTitle, Story, Date)
VALUES ('$Title', '$Title', '$clean_html', '$Date')";
if (!mysqli_query($conn,$sql))
    die('Crap.');

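For starters, I'd suggest using prepared statements and bind_param() instead of inserting escaped text directly into the SQL text. Escaping will work, but prepared statements keep the data separate from the SQL, which makes injection mistakes much harder to make, and IIRC they can also be more efficient when the same statement runs repeatedly.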
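A rough sketch of the prepared-statement suggestion, in case it helps. The function name insertStory() is made up for illustration; the table and column names come from the INSERT quoted above, and I'm assuming all four columns take strings.

```php
<?php
// Sketch only: insert a story with a prepared statement instead of building
// the SQL by string interpolation. insertStory() is a hypothetical helper.
function insertStory(mysqli $conn, string $title, string $story, string $date): bool
{
    $stmt = $conn->prepare(
        'INSERT INTO published (Title, pageTitle, Story, Date) VALUES (?, ?, ?, ?)'
    );
    if ($stmt === false) {
        // Reached only when mysqli is not in exception mode; on PHP 8.1+ the
        // default error mode throws mysqli_sql_exception instead.
        return false;
    }

    // 'ssss' = four string parameters. The title is bound twice because the
    // original query stores it in both Title and pageTitle.
    $stmt->bind_param('ssss', $title, $title, $story, $date);
    $ok = $stmt->execute();
    $stmt->close();

    return $ok;
}
```

You would pass the purified HTML as $story. No mysqli_real_escape_string() call is needed in that case, because bound parameters are never parsed as SQL.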

Next, I'd suggest changing your code to print $_POST['Story'] directly onto the page, and (if you want to stay with your string-interpolation approach to the SQL statement) print out the SQL as well. That should let you know whether it's a problem with inserting into the database (maybe your DB user doesn't have write access?) or with the $_POST variable.
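Something like this would do for the printing step. debugDump() is a hypothetical helper, not part of PHP or HTMLPurifier; it escapes the value so the browser shows the raw markup instead of rendering it.

```php
<?php
// Hypothetical debugging helper: returns the value wrapped in <pre> with HTML
// special characters escaped, so tags and attributes are visible on the page.
function debugDump(string $label, string $value): string
{
    return '<pre>' . htmlspecialchars($label . ': ' . $value, ENT_QUOTES, 'UTF-8') . "</pre>\n";
}

// Usage, assuming $_POST['Story'] and $sql exist as in the snippet above:
// echo debugDump('POST Story', $_POST['Story'] ?? '(not set)');
// echo debugDump('SQL', $sql);
```

Comparing the two dumps should show whether the unwanted markup is already in the POST data or only appears once the SQL is built.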

Lastly, I don't know if there's any real reason not to do it, but writing back to $_POST feels icky. I'd suggest using a different variable to hold the scrubbed HTML.

0

Nah that's not how it looks. For the full page here's a pastebin: http://pastebin.com/Vqcn7QbF

What it's posting to database:

<p><img alt="" src="../images/article/rockband3.jpg" style="height:7px; width:12px" /></p>

Which shouldn't be the case. I removed the style attributes so they shouldn't be allowed and should automatically be removed. But they aren't... Which leads me to believe HTMLPurifier isn't actually purifying the data. And every time I try to echo to the same page it is completely broken and returns:

<img alt="\"\"" src="\"../images/article/image.jpg\"" style="" width:1222px\"=""> 

Completely different from what it submits to the database.

1

Ah ok, gotchya. Thanks for the full text, that makes it easier. :)

The problem is that you're purifying $Story into $clean_html, but then when you escape it, you're escaping the original $_POST['Story'] rather than escaping your purified $clean_html. So instead of this at line 86:

$clean_html = $purifier->purify($Story);
$clean_html = mysqli_real_escape_string($conn, $_POST['Story']);

you want this:

$Story = $purifier->purify($_POST['Story']);
$clean_html = mysqli_real_escape_string($conn, $Story);

Edit: Whoops, I got it swapped around too. Fixed now.
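If it helps to avoid this kind of mix-up in future, here is one way to sketch the order of operations so that the raw input can only enter at one point. purifyThenEscape() is a hypothetical helper, and the two callables are stand-ins for $purifier->purify() and mysqli_real_escape_string().

```php
<?php
// Hypothetical helper: purify first, then escape the purified result.
// The raw input is only ever handed to $purify, so the escaping step cannot
// accidentally pick up the unpurified value.
function purifyThenEscape(callable $purify, callable $escape, string $raw): string
{
    $purified = $purify($raw);   // e.g. [$purifier, 'purify']
    return $escape($purified);   // e.g. fn ($s) => mysqli_real_escape_string($conn, $s)
}

// Usage, assuming $purifier and $conn are set up as in the thread:
// $clean_html = purifyThenEscape(
//     [$purifier, 'purify'],
//     fn (string $s): string => mysqli_real_escape_string($conn, $s),
//     $_POST['Story']
// );
```

This is just one way to structure it; two clearly named variables, as in the fix above, work equally well.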

0

When I get home to fix this and this works I am going to cry... Such a stupid mistake that I couldn't notice...

Also I forgot to thank you. So thank you so much for pointing out my stupid mistake!

1

Glad to help! :) It's the simplest mistakes that are always the hardest to spot.