Bringing a Paginated API
This documentation uses Power BI as an example, but it can be transposed to other BI tools, or other tools that can consume a paginated API.
1. Two Worlds, Two Logics
Concept | What it is | How it acts |
---|---|---|
Relational Table (Power BI) | A fixed-size grid of rows and columns. You ask for it once and get the whole set. | “All the data is already on the shelf; pick it up in one go.” |
Paginated API Endpoint | A web service that delivers data in slices (“pages”) of 10, 50, 100… records at a time. Each page includes a pointer to the next one. | “The data is in a novel—read page 1, then page 2, and so on.” |
2. The Integration Strategy
Think of yourself as building a small courier service between the novel (API) and the shelf (Power BI):
- Initial Request – Get Page 1
- Send the first GET request (often with
page=1
andpage_size=
n``).
- Send the first GET request (often with
- Loop – Keep Turning the Pages
- Inspect the response: it contains both the records and a hint about the next page (
nextPageToken
,page=2
, etc.). - Repeat the request with the new page parameters until the API says, “No more pages.”
- Inspect the response: it contains both the records and a hint about the next page (
- Accumulate – Build a Single Dataset
- Append each page’s rows into one growing collection (a list, DataFrame, or Power Query table).
- Once the loop ends, you have a complete, flat table.
- Load into Power BI
- Deliver that finished table to Power BI (as a CSV, parquet, database table, or via Power Query’s
Web.Contents
function wrapped in a loop).
- Deliver that finished table to Power BI (as a CSV, parquet, database table, or via Power Query’s
3. Where the Script Fits
Power BI itself cannot iterate over pages; the script acts as the translator
Paginated API → Script (loops, concatenates) → Single flat file/table → Power BI
- What language? Common choices are Python (with pandas & requests) or Power Query M (using
List.Generate
for the loop). - What does it do?
- Handles authentication once.
- Executes the loop described above.
- Writes/returns a consolidated table Power BI can refresh.
Key takeaway: Pagination is only an API-side efficiency mechanism; your script neutralizes it by stitching every page back together before Power BI ever sees the data.
4. Refreshing Going Forward
- Scheduled refresh in Power BI simply reruns the script.
- The loop safeguards you against future growth—if the task list doubles, the script just turns more pages.
5. Mental Model & Analogy
API Novel | Script Courier | Power BI Library |
---|---|---|
Chapters/Pagination | Courier reads all chapters | Final bound book on the shelf |
Whenever you refresh, the courier re-reads the novel from chapter 1 to “The End,” prints a fresh copy, and hands it to the library. Power BI patrons never worry about chapters—they always get the full story.
Summary for the Data Analyst
- Pagination ≠ Table – The API sends one “chapter” at a time.
- Use a Looping Script – It turns the pages, collects everything, and hands Power BI a single table.
- Zero Code Needed to Understand – Just remember: read all pages, append them together, then load.
Once you’re comfortable with that concept, the actual Python/Power Query example you already have will make perfect sense—it’s simply the automation of the page-turning process.