Character-based pagination - break pages on text boundaries, not inside words, punctuation, or code
I am writing code for character-based pagination. I have articles on my site that I want to split into pages by length.
I am still working on the code, but I have two problems:
- It breaks pages in the middle of words and HTML tags; I want a break to fall only after a complete word, tag, or punctuation mark.
- The pagination bar shows the wrong number of pages.
I need help solving these two problems. The code follows:
$text = file_get_contents($View);
$ArticleLength = strlen($text);
$CharsPerPage = 5000;
$NoOfPages = round((double)$ArticleLength / (double)$CharsPerPage);
$CurrentPage = $this->ReturnNeededObject('pagenumber');
$Page = (isset($CurrentPage) && '' !== $CurrentPage) ? $CurrentPage : '1';
$PageText = substr($text, $CharsPerPage*($Page-1), $CharsPerPage);
echo $PageText, '<p>';
for ($i=1; $i<$NoOfPages+1; $i++)
{
    if ($i == $CurrentPage)
    {
        echo '<strong>', $i, '</strong>';
    }
    else
    {
        echo '<a href="', $i, '">', $i, '</a>';
    }
    echo ' | ';
}
echo '</p>';
What am I doing wrong?
$NoOfPages = round((double)$ArticleLength / (double)$CharsPerPage);
This should use ceil instead of round. With round, 4.2 pages of text yields only pages 1-4; you need a 5th page to show the remaining 0.2 of a page.
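To make the difference concrete, here is a minimal sketch. The article length of 21,000 characters is a made-up value chosen so that the division gives 4.2; the variable names follow the question's code.

```php
<?php
// Hypothetical values for illustration: a 21,000-character article.
$ArticleLength = 21000;
$CharsPerPage  = 5000;

// 21000 / 5000 = 4.2
$rounded = (int) round($ArticleLength / $CharsPerPage); // 4: the last 1,000 characters get no page
$ceiled  = (int) ceil($ArticleLength / $CharsPerPage);  // 5: the partial page is counted

echo $rounded, ' vs ', $ceiled, PHP_EOL; // prints "4 vs 5"
```

Note that round() can also overshoot in some cases where the text has been cut into exact chunks, but with fixed-size substr() slices it only ever undercounts or matches.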
The other part is trickier. It is common to put some sort of marker in the file to indicate where each page break should occur; no matter how smart your code is, it cannot judge where a good break falls the way a person can.
If you insist on doing this automatically, I suggest logic that moves forward or backward to the nearest space when a page break is created, which isn't too hard. Deciding whether you are inside a tag is more difficult; for that you will probably need some fairly heavy regex or an HTML parsing tool.
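As a rough sketch of the "nearest space" idea, the code below cuts at the last whitespace that lies outside a tag, and runs forward if no such point exists within the page budget. The helper names findBreak and splitPages are hypothetical, and the tag detection is a naive scan for < and >: it assumes every page starts outside a tag and will be fooled by a literal > inside an attribute value or a comment. It also does not re-open elements that span a page boundary.

```php
<?php
// Hypothetical helper: return the offset at which a page starting at $start
// should end, so that the cut falls after whitespace and outside any <...> tag.
function findBreak(string $text, int $start, int $max): int
{
    $len = strlen($text);
    $limit = $start + $max;
    if ($limit >= $len) {
        return $len; // the rest fits on one page
    }
    $inTag = false;
    $lastSafe = -1;
    for ($i = $start; $i < $len; $i++) {
        $ch = $text[$i];
        if ($ch === '<') {
            $inTag = true;
        } elseif ($ch === '>') {
            $inTag = false;
        } elseif (!$inTag && ctype_space($ch)) {
            $lastSafe = $i + 1; // cut just after this whitespace
        }
        // Stop once the budget is used up and a safe cut is known;
        // otherwise keep running forward to the next safe cut.
        if ($i + 1 >= $limit && $lastSafe !== -1) {
            return $lastSafe;
        }
    }
    return $len;
}

// Hypothetical helper: split $text into pages of roughly $charsPerPage bytes.
function splitPages(string $text, int $charsPerPage): array
{
    $pages = [];
    $start = 0;
    $len = strlen($text);
    while ($start < $len) {
        $end = findBreak($text, $start, $charsPerPage);
        $pages[] = substr($text, $start, $end - $start);
        $start = $end;
    }
    return $pages;
}

$pages = splitPages('Hello <a href="x y">big world</a> again', 12);
$NoOfPages = count($pages); // the page count now comes from the actual split
```

Because the pages no longer have a fixed size, the number of pages for the pagination bar is best taken from count($pages) rather than computed from the article length.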
You are miscalculating the number of pages: you must use ceil(), not round() (for example, 4.1 pages of text need 5 pages to display).
As for the other problem, you will be in big trouble if there is any HTML in there. For example, you need to know that <div> and <p> are OK to split, but <table> is not (unless you want things to get really hairy)!
To do this correctly, you have to use an HTML library to build an element tree and then navigate from there.
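One possible shape for the element-tree approach, using PHP's bundled DOM extension: parse the article, walk the top-level elements, and start a new page only between them, so a <table> is never cut in half. The function name paginateByElement is hypothetical, and the sketch assumes the article is an HTML fragment of sibling block elements. DOMDocument::loadHTML may mis-decode non-ASCII input unless the encoding is declared, which is not handled here.

```php
<?php
// Hypothetical helper: group the top-level elements of an HTML fragment into
// pages holding roughly $charsPerPage characters of visible text each.
function paginateByElement(string $html, int $charsPerPage): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return [];
    }

    $pages = [];
    $current = '';
    $count = 0;
    foreach ($body->childNodes as $node) {
        // Measure the text only, so long URLs and attributes do not count.
        $size = strlen($node->textContent);
        if ($count > 0 && $count + $size > $charsPerPage) {
            $pages[] = $current; // close the page before this element
            $current = '';
            $count = 0;
        }
        $current .= $doc->saveHTML($node); // serialize the whole element, tags included
        $count += $size;
    }
    if ($current !== '') {
        $pages[] = $current;
    }
    return $pages;
}

$pages = paginateByElement(
    '<p>aaaa</p><table><tr><td>bbbb</td></tr></table><p>cccc</p>',
    8
);
```

An element larger than the budget still gets a page of its own here; splitting inside one would mean descending into its children, which is where the approach gets complicated.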
As per your first statement
It breaks pages in the middle of HTML words and tags
it looks like your character count is performed after the markup is inserted. This means that, for example, long URLs in links count against the page length you are aiming for. However, you did not say how the articles are originally created.
I would suggest finding a point in the article creation process where you can examine the source text. By treating the actual content (without markup) as a set of paragraphs and estimating the vertical length of each paragraph from the typical number of characters per line, you can arrive at a more consistent measure of page size.
I would also consider breaking only between paragraphs, to keep units of thought together on the same page. Speaking as a reader, I really hate sites that make me stop, click a button or link, and wait for the page to reload, all in the middle of one thought.
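A small sketch of that idea, assuming the paragraphs are available as plain-text strings before markup is added. The function name paginateParagraphs, the 80-characters-per-line figure, and the one extra line of spacing per paragraph are all illustrative assumptions, not measurements of any real layout.

```php
<?php
// Hypothetical helper: pack plain-text paragraphs into pages by estimated
// rendered lines, breaking only between paragraphs.
function paginateParagraphs(array $paragraphs, int $linesPerPage, int $charsPerLine = 80): array
{
    $pages = [];
    $current = [];
    $used = 0;
    foreach ($paragraphs as $p) {
        // Estimated height: wrapped lines plus one blank line of spacing.
        $lines = (int) ceil(strlen($p) / $charsPerLine) + 1;
        if ($current !== [] && $used + $lines > $linesPerPage) {
            $pages[] = $current; // page is full: break before this paragraph
            $current = [];
            $used = 0;
        }
        $current[] = $p;
        $used += $lines;
    }
    if ($current !== []) {
        $pages[] = $current;
    }
    return $pages;
}

// Estimated heights: 3, 2, and 4 lines; a 5-line page holds the first two.
$pages = paginateParagraphs(
    [str_repeat('a', 160), str_repeat('b', 80), str_repeat('c', 240)],
    5
);
```

A paragraph taller than a whole page is kept intact on its own page, which matches the goal of never interrupting one thought.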