2 July 2008

How to get (view) the HTML source code of a website

In this article, I show how to get the HTML source code of any website using PHP.

I created an HTML form for submitting the website URL. After the URL is submitted, the PHP code fetches the page and displays its full HTML source code in a textarea.

$domain = $_POST['domain'];
$handle = fopen("http://$domain", "r");
//$contents = stream_get_contents($handle);
$contents = '';
while (!feof($handle)) {
    $contents .= fread($handle, 8192);
}
fclose($handle);

First, the website URL submitted through the HTML form is opened with fopen(). Then a while loop appends each chunk returned by fread() to the $contents string until the end of the file is reached. Finally, the file/URL pointer is closed with fclose().
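The snippet above assumes that fopen() always succeeds. As a minimal sketch (the helper name fetch_source() is hypothetical and not part of the original script), the same loop can be wrapped in a function that reports failure instead of reading from an invalid handle:

```php
<?php
// Hypothetical helper, not from the original post: wraps the
// fopen()/fread()/fclose() loop and returns false if the URL cannot be opened.
function fetch_source($url)
{
    // "@" hides the warning fopen() raises on failure; the return value is checked instead
    $handle = @fopen($url, 'r');
    if ($handle === false) {
        return false;
    }

    $contents = '';
    while (!feof($handle)) {
        $chunk = fread($handle, 8192);
        if ($chunk === false) {
            break; // read error: stop instead of looping forever
        }
        $contents .= $chunk;
    }
    fclose($handle);

    return $contents;
}
```

With this in place, the form handler could call something like fetch_source("http://$domain") and show an error message when the result is false.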

You can also get all the contents without a loop by using the stream_get_contents() function (the commented-out line in the code above). It operates on an already open stream resource and returns the remaining contents as a string. This function is only available in PHP 5 and later, so stick with the loop if you are on an older version.
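A rough sketch of the loop-free variant follows. The function name fetch_source_at_once() is made up for illustration; only fopen(), stream_get_contents() and fclose() come from PHP itself:

```php
<?php
// Hypothetical helper: same result as the while loop, but using
// stream_get_contents() (PHP 5+) to read the rest of the stream in one call.
function fetch_source_at_once($url)
{
    $handle = @fopen($url, 'r');
    if ($handle === false) {
        return false; // the URL could not be opened
    }

    // Returns everything from the current pointer position to the end of the stream
    $contents = stream_get_contents($handle);
    fclose($handle);

    return $contents;
}
```

Because stream_get_contents() starts at the current position of the pointer, anything already consumed by an earlier fread() on the same handle is not returned again.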

Description of functions used:

fopen() opens the URL. The mode used here ('r') is read mode. Opening a URL this way requires the allow_url_fopen setting to be enabled in the PHP configuration.

feof() tests for end-of-file on a file pointer. It returns true if the file pointer is at EOF; otherwise it returns false.

fread() reads up to length bytes from the file pointer referenced by handle. Reading stops as soon as one of the following happens: length bytes have been read, EOF (end of file) is reached, a packet becomes available (for network streams), or 8192 bytes have been read (after opening a userspace stream). A single fread() call on a URL can therefore return less than the whole page, which is why it is called in a loop.

fclose() closes an open file pointer.
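To see how these four functions cooperate without touching the network, here is a small illustration that reads an in-memory stream in chunks. The helper read_in_chunks() and the 4-byte chunk size are assumptions chosen only for the demo:

```php
<?php
// Illustration only: collect the chunks that fread() returns until feof() is true.
function read_in_chunks($handle, $length)
{
    $chunks = array();
    while (!feof($handle)) {
        $chunk = fread($handle, $length);
        if ($chunk === false) {
            break; // read error
        }
        if ($chunk !== '') {
            $chunks[] = $chunk; // keep only non-empty reads
        }
    }
    return $chunks;
}

$handle = fopen('php://memory', 'w+'); // in-memory stream instead of a URL
fwrite($handle, 'abcdefghij');
rewind($handle);                       // move the pointer back to the start
$chunks = read_in_chunks($handle, 4);  // "abcd", "efgh", "ij"
fclose($handle);
```

The last chunk is shorter than the requested length because the end of the stream was reached first, which is the same reason the article's loop keeps calling fread() until feof() reports true.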


Full source code:

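<?php
if (isset($_POST['submit']))
{
    $domain = $_POST['domain'];
    $contents = '';
    // fopen() returns false when the URL cannot be opened
    $handle = fopen("http://$domain", "r");
    //$contents = stream_get_contents($handle);
    if ($handle) {
        // Read the page in chunks of up to 8192 bytes until end-of-file
        while (!feof($handle)) {
            $contents .= fread($handle, 8192);
        }
        fclose($handle);
    }
    //var_dump($contents);
}
?>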
<html>
<head>
<title>HTML Source Code Viewer</title>
</head>
<body>
<h2>
HTML Source Code Viewer
</h2>
<form method="post" name="pageform" action="" onsubmit="return validate(this);">
<table border="0" style="border-collapse: collapse" width="">
<tr>
<td width="" height="91" valign="top">
<table style="border-collapse: collapse" width="" class="tooltop" height="76">
<tr>
<td>
<table border="0" style="border-collapse: collapse" width="" cellspacing="5">
<tr>
<td height="28" width="100"><font size="2"><b>View source of</b></font><b><font size="2">:
</font></b></td>
<td height="28" width="">
<font size="1">http://</font><input type="text" name="domain" size="26" value="<?=isset($_POST['domain']) ? htmlspecialchars($_POST['domain']) : ''?>"></td>
<td height="28" width="">
<input type="submit" name="submit" value="View!" style="float: left"></td>
</tr>
<tr>
<td width="" height="21">&nbsp;</td>
<td width="" colspan="2" height="21" valign="top"><font size="1">(eg. chapagain.com.np)</font></td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
<?php
if(isset($_POST['submit']))
{
?>
<tr>
<td>
<textarea rows="10" cols="60" name="code"><?=htmlspecialchars($contents)?></textarea>
</td>
</tr>
<?php
}
?>
</table>
</form>
<script language="JavaScript">
function validate(theform) {
if (theform.domain.value == "") { alert("No Domain"); return false; }
return true;
}
</script>

</body>
</html>
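The form prints "http://" in front of the text box and the script prepends it again, so a pasted full URL would turn into "http://http://…" and fail to open. One possible way to tolerate that input is sketched below; normalize_domain() is a hypothetical helper, not part of the original script:

```php
<?php
// Hypothetical helper: accept either a bare host such as "chapagain.com.np"
// or a full URL, and return the part after the scheme.
function normalize_domain($input)
{
    $input = trim($input);
    // Drop a leading "http://" or "https://" (case-insensitive) if one was typed
    return preg_replace('#^https?://#i', '', $input);
}
```

In the script above, this would mean using normalize_domain($_POST['domain']) where $domain is assigned, before the value is passed to fopen().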

From Mukesh Chapagain's Blog, post How to get(view) html source code of a website

  • http://roshanbh.com.np Roshan Bhattarai

    Thanks a lot for the tons of support, Mukesh…

    BTW, if you’re using PHP 4.3 or later you can use

    $htmlval = file_get_contents("http://example.com");

    to do this in an easier way, but the URL fopen wrappers (the allow_url_fopen setting) must be enabled in the PHP configuration to fetch content from a URL with this function. On most servers they are enabled by default.

  • Mackenzie

    Hey, we got the interface, but when we typed the URL in the box it didn't work. Please assist us.

  • http://www.chapagain.com.np admin

    I think you wrote the URL as “http://www.chapagain.com.np”. Don’t put “http://” in front. Just write “www.chapagain.com.np” (without the quotes), or simply “chapagain.com.np”. (I have given my website URL as an example. You can use your own.)

  • http://oscargodson.com Oscar Godson

    I wonder if you can get generated content? I use a lot of JS all the time on my sites and if I could get generated content that would be cool, unfortunately I know this would be impossible since it would have to run the actual JS. Dang!
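
Following up on the file_get_contents() suggestion in the first comment, here is a minimal sketch of that approach. The function name get_source_simple() is hypothetical; the check on allow_url_fopen reflects the setting mentioned in that comment:

```php
<?php
// Hypothetical helper: fetch a page with file_get_contents(), giving up early
// when URL wrappers are disabled via the allow_url_fopen setting.
function get_source_simple($url)
{
    $is_remote = (bool) preg_match('#^https?://#i', $url);
    $wrappers_on = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
    if ($is_remote && !$wrappers_on) {
        return false; // URL wrappers are turned off on this server
    }

    // Returns the whole content as a string, or false on failure
    return @file_get_contents($url);
}
```

Like the fopen() version, this only returns the HTML the server sends; it does not run JavaScript, so content generated in the browser is not included.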