How to get likers and commenters of public posts

In this article, I give a basic guide to downloading data about the public posts of public Facebook pages from the Facebook API. You can use such data for social network analysis, text mining or basic statistics.

First, you need a Facebook account with a developer's account. Facebook provides a guide on how to set it up. You need to create your own app and get its unique ID and token.

Then, you need to get the ID of the Facebook page. One way to get it is to find it in the source code of the Facebook page. Next, open the Facebook Graph API Explorer. There, enter a request for the list of posts of a specific page. The example is for the British newspaper The Guardian.
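As a rough illustration of such a request, here is a minimal R sketch that builds the URL for the posts of one page. The values "PAGE_ID" and "YOUR_TOKEN" are placeholders, and the unversioned host and path are an assumption: Facebook may require a version prefix or extra permissions, so check the current Graph API documentation before relying on it.

```r
# Placeholders (assumptions): substitute your own page ID and access token.
page_id <- "PAGE_ID"
access_token <- "YOUR_TOKEN"

# Build the request URL for the list of posts of one page.
# URLencode() escapes characters such as "|" that may occur in a token.
posts_url <- sprintf(
  "https://graph.facebook.com/%s/posts?access_token=%s",
  page_id,
  URLencode(access_token, reserved = TRUE)
)
print(posts_url)
```

In the Graph API Explorer itself, only the path part (the page ID followed by /posts) is typed into the request field.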

At the end of the page, you can see the “paging” part of the results.

Click the link under “next”. Again, find the “paging” part of the results and copy the URL under “previous”.

You need to get these URLs for all desired Facebook pages. Create a list of these pages with two columns, the name of the page (column name “Name”) and the URL (column name “URL”), and save it in the CSV format. Now we are ready to work with OpenRefine.
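If you prefer to create this list in R rather than in a spreadsheet, a minimal sketch could look like this. The file name pages.csv and the URL are placeholders chosen for the example; only the column names “Name” and “URL” come from the description above.

```r
# One row per Facebook page; the URL is the one copied from the "paging" part.
pages <- data.frame(
  Name = c("guardian"),
  URL = c("https://graph.facebook.com/PAGE_ID/posts?access_token=YOUR_TOKEN"),
  stringsAsFactors = FALSE
)

# row.names = FALSE keeps the file to exactly the two columns Name and URL.
write.csv(pages, file = "pages.csv", row.names = FALSE)
```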

The next step is to use OpenRefine to get the list of the most recent posts. If you created the CSV file in the previous step, you can simply import it into OpenRefine and apply this script (paste it under Undo / Redo → Apply).

[
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 1 at index 2 by fetching URLs based on column URL using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "1",
    "columnInsertIndex": 2,
    "baseColumnName": "URL",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column 1a at index 3 based on column 1 using expression grel:replace(value,/.*next\\\"\\:\\\"/,\"\")",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "1a",
    "columnInsertIndex": 3,
    "baseColumnName": "1",
    "expression": "grel:replace(value,/.*next\\\"\\:\\\"/,\"\")",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition",
    "description": "Create column n1 at index 3 based on column 1 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n1",
    "columnInsertIndex": 3,
    "baseColumnName": "1",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column 1a",
    "columnName": "1a"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 2 at index 4 by fetching URLs based on column n1 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "2",
    "columnInsertIndex": 4,
    "baseColumnName": "n1",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n2 at index 5 based on column 2 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n2",
    "columnInsertIndex": 5,
    "baseColumnName": "2",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 3 at index 6 by fetching URLs based on column n2 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "3",
    "columnInsertIndex": 6,
    "baseColumnName": "n2",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n3 at index 7 based on column 3 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n3",
    "columnInsertIndex": 7,
    "baseColumnName": "3",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 4 at index 8 by fetching URLs based on column n3 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "4",
    "columnInsertIndex": 8,
    "baseColumnName": "n3",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n4 at index 9 based on column 4 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n4",
    "columnInsertIndex": 9,
    "baseColumnName": "4",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 5 at index 10 by fetching URLs based on column n4 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "5",
    "columnInsertIndex": 10,
    "baseColumnName": "n4",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n5 at index 11 based on column 5 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n5",
    "columnInsertIndex": 11,
    "baseColumnName": "5",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 6 at index 12 by fetching URLs based on column n5 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "6",
    "columnInsertIndex": 12,
    "baseColumnName": "n5",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n6 at index 13 based on column 6 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n6",
    "columnInsertIndex": 13,
    "baseColumnName": "6",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 7 at index 14 by fetching URLs based on column n6 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "7",
    "columnInsertIndex": 14,
    "baseColumnName": "n6",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n7 at index 15 based on column 7 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n7",
    "columnInsertIndex": 15,
    "baseColumnName": "7",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 8 at index 16 by fetching URLs based on column n7 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "8",
    "columnInsertIndex": 16,
    "baseColumnName": "n7",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n8 at index 17 based on column 8 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n8",
    "columnInsertIndex": 17,
    "baseColumnName": "8",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 9 at index 18 by fetching URLs based on column n8 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "9",
    "columnInsertIndex": 18,
    "baseColumnName": "n8",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n9 at index 19 based on column 9 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n9",
    "columnInsertIndex": 19,
    "baseColumnName": "9",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 10 at index 20 by fetching URLs based on column n9 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "10",
    "columnInsertIndex": 20,
    "baseColumnName": "n9",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n10 at index 21 based on column 10 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n10",
    "columnInsertIndex": 21,
    "baseColumnName": "10",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 11 at index 22 by fetching URLs based on column n10 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "11",
    "columnInsertIndex": 22,
    "baseColumnName": "n10",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n11 at index 23 based on column 11 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n11",
    "columnInsertIndex": 23,
    "baseColumnName": "11",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 12 at index 24 by fetching URLs based on column n11 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "12",
    "columnInsertIndex": 24,
    "baseColumnName": "n11",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n12 at index 25 based on column 12 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n12",
    "columnInsertIndex": 25,
    "baseColumnName": "12",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 13 at index 26 by fetching URLs based on column n12 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "13",
    "columnInsertIndex": 26,
    "baseColumnName": "n12",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n13 at index 27 based on column 13 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n13",
    "columnInsertIndex": 27,
    "baseColumnName": "13",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 14 at index 28 by fetching URLs based on column n13 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "14",
    "columnInsertIndex": 28,
    "baseColumnName": "n13",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n14 at index 29 based on column 14 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n14",
    "columnInsertIndex": 29,
    "baseColumnName": "14",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition-by-fetching-urls",
    "description": "Create column 15 at index 30 by fetching URLs based on column n14 using expression grel:value",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "15",
    "columnInsertIndex": 30,
    "baseColumnName": "n14",
    "urlExpression": "grel:value",
    "onError": "set-to-blank",
    "delay": 5000
  },
  {
    "op": "core/column-addition",
    "description": "Create column n15 at index 31 based on column 15 using expression grel:value.parseJson().paging.next",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "n15",
    "columnInsertIndex": 31,
    "baseColumnName": "15",
    "expression": "grel:value.parseJson().paging.next",
    "onError": "set-to-blank"
  },
  {
    "op": "core/transpose-columns-into-rows",
    "description": "Transpose cells in columns starting with URL into rows in one new column named code",
    "startColumnName": "URL",
    "columnCount": -1,
    "ignoreBlankCells": true,
    "fillDown": true,
    "combinedColumnName": "code",
    "prependColumnName": false,
    "separator": ":"
  },
  {
    "op": "core/column-addition",
    "description": "Create column mess at index 2 based on column code using expression grel:forEach(value.parseJson().data,v,[v.message,v.id,v.created_time].join(\"||\")).join(\":::\")",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "newColumnName": "mess",
    "columnInsertIndex": 2,
    "baseColumnName": "code",
    "expression": "grel:forEach(value.parseJson().data,v,[v.message,v.id,v.created_time].join(\"||\")).join(\":::\")",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-split",
    "description": "Split column mess by separator",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "columnName": "mess",
    "guessCellType": true,
    "removeOriginalColumn": true,
    "mode": "separator",
    "separator": ":::",
    "regex": false,
    "maxColumns": 0
  },
  {
    "op": "core/column-removal",
    "description": "Remove column code",
    "columnName": "code"
  },
  {
    "op": "core/row-flag",
    "description": "Flag rows",
    "engineConfig": {
      "facets": [
        {
          "expression": "value",
          "invert": false,
          "selectError": false,
          "omitError": false,
          "name": "mess 1",
          "selectBlank": true,
          "columnName": "mess 1",
          "omitBlank": false,
          "type": "list",
          "selection": []
        }
      ],
      "mode": "row-based"
    },
    "flagged": true
  },
  {
    "op": "core/row-removal",
    "description": "Remove rows",
    "engineConfig": {
      "facets": [
        {
          "expression": "row.flagged",
          "invert": false,
          "selectError": false,
          "omitError": false,
          "name": "Flagged Rows",
          "selectBlank": false,
          "columnName": "",
          "omitBlank": false,
          "type": "list",
          "selection": [
            {
              "v": {
                "v": true,
                "l": "true"
              }
            }
          ]
        }
      ],
      "mode": "row-based"
    }
  },
  {
    "op": "core/transpose-columns-into-rows",
    "description": "Transpose cells in columns starting with mess 1 into rows in one new column named post",
    "startColumnName": "mess 1",
    "columnCount": -1,
    "ignoreBlankCells": true,
    "fillDown": true,
    "combinedColumnName": "post",
    "prependColumnName": false,
    "separator": ":"
  },
  {
    "op": "core/column-split",
    "description": "Split column post by separator",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "columnName": "post",
    "guessCellType": true,
    "removeOriginalColumn": true,
    "mode": "separator",
    "separator": "||",
    "regex": false,
    "maxColumns": 0
  },
  {
    "op": "core/column-rename",
    "description": "Rename column Column to Page",
    "oldColumnName": "Column",
    "newColumnName": "Page"
  },
  {
    "op": "core/column-rename",
    "description": "Rename column post 1 to Post_text",
    "oldColumnName": "post 1",
    "newColumnName": "Post_text"
  },
  {
    "op": "core/column-rename",
    "description": "Rename column post 2 to Post_ID",
    "oldColumnName": "post 2",
    "newColumnName": "Post_ID"
  },
  {
    "op": "core/column-rename",
    "description": "Rename column post 3 to Post_time",
    "oldColumnName": "post 3",
    "newColumnName": "Post_time"
  }
]
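The core of the script above is a loop unrolled by hand: fetch a URL, read paging.next from the response, fetch that URL, and so on for 15 pages. The following R sketch shows the same idea as a loop. The function names collect_pages and fetch_page and the in-memory fake_api are made up for this illustration so that it runs offline; in real use, fetch_page would download and parse the JSON (for example with the jsonlite package) and should pause between requests, as the script does with its 5000 ms delay.

```r
# Follow the "next" links of a paged response, collecting at most max_pages pages.
# fetch_page is a hypothetical function: URL -> parsed response (a list).
collect_pages <- function(first_url, fetch_page, max_pages = 15) {
  pages <- list()
  url <- first_url
  while (!is.null(url) && length(pages) < max_pages) {
    page <- fetch_page(url)
    pages[[length(pages) + 1]] <- page
    url <- page$paging[["next"]]  # NULL when there is no further page
  }
  pages
}

# Tiny in-memory stand-in for the API (invented data, not real posts).
fake_api <- list(
  url1 = list(data = list(list(id = "1_1", message = "first")),
              paging = list("next" = "url2")),
  url2 = list(data = list(list(id = "1_2", message = "second")),
              paging = list("next" = "url3")),
  url3 = list(data = list(list(id = "1_3", message = "third")),
              paging = list())
)
fake_fetch <- function(url) fake_api[[url]]

pages <- collect_pages("url1", fake_fetch)
print(length(pages))
```

The max_pages argument plays the role of the fixed number of fetch steps in the script; raising it corresponds to adding more pairs of steps.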

The result is a table with four columns: the page name (Page), the post text (Post_text), the post ID (Post_ID) and the post time (Post_time). Export the project as a CSV file.
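To see how the script gets from one JSON response to these columns, it helps to look at the intermediate text it builds: the fields of a post are joined with “||” and the posts with “:::”, and both are then split again. The R sketch below mimics the splitting on an invented cell; the post texts, IDs and times are made up. One caveat that follows from this design: a post text that itself contains “||” or “:::” would be split in the wrong place.

```r
# One cell as built by the "mess" step: fields joined by "||", posts by ":::".
cell <- "Hello||1_1||2017-01-01T10:00:00+0000:::World||1_2||2017-01-02T11:30:00+0000"

# fixed = TRUE treats the separators as plain text, not as regular expressions.
posts <- strsplit(cell, ":::", fixed = TRUE)[[1]]
fields <- strsplit(posts, "||", fixed = TRUE)

# One row per post, one column per field.
result <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
names(result) <- c("Post_text", "Post_ID", "Post_time")
print(result)
```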

Now we need to get the users who liked or commented on each post. I use R for this. Below is the basic code for getting the data from the Facebook API; you will need to make several modifications to it.

setwd("YOUR DIRECTORY")

# One-time setup: install the package, then load it
install.packages("Rfacebook")
library(Rfacebook)

# Authorise the app and obtain an OAuth token
token <- fbOAuth(app_id = "YOUR ID", app_secret = "YOUR SECRET", extended_permissions = TRUE)

# The CSV file exported from OpenRefine (columns Page, Post_text, Post_ID, Post_time)
database <- read.csv("YOUR FILE.csv", stringsAsFactors = FALSE)

# Keep the posts of one page and drop rows with empty cells
guardian1 <- database[database$Page == "guardian", ]
guardian1[guardian1 == ""] <- NA
guardian <- na.omit(guardian1)

guardian_likers <- NULL
guardian_commenters <- NULL
for (i in seq_len(nrow(guardian))) {
  # Download up to 7000 likes and comments of one post
  post_guardian <- getPost(guardian$Post_ID[i], token = token, n = 7000, likes = TRUE, comments = TRUE)
  # use 'from_name' instead of 'from_id' here to collect user names rather than IDs
  guardian_commenters <- c(guardian_commenters, post_guardian$comments$from_id)
  guardian_likers <- c(guardian_likers, post_guardian$likes$from_id)
}

# One file per list of user IDs
write.table(guardian_likers, file = "guardian_likers.csv", sep = ",", col.names = NA, qmethod = "double")
write.table(guardian_commenters, file = "guardian_commenters.csv", sep = ",", col.names = NA, qmethod = "double")

 

You need to change the working directory, the app ID, the app secret and the file name to your own.

If you execute the code, the result is two CSV files in your working directory, one with likers and one with commenters. From here, you can work with the lists of IDs of likers or commenters of your own or any other public Facebook page.
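As one example of such further work, the sketch below counts how often each user appears among the likers and finds the users who both liked and commented. It uses short invented vectors in place of the real lists of IDs, so the user IDs shown are placeholders.

```r
# Invented stand-ins for the vectors of user IDs collected per page.
guardian_likers <- c("u1", "u2", "u1", "u3", "u1")
guardian_commenters <- c("u2", "u4", "u2")

# Number of likes per user, most active users first.
like_counts <- sort(table(guardian_likers), decreasing = TRUE)
print(head(like_counts, 3))

# Users who appear in both lists.
both <- intersect(guardian_likers, guardian_commenters)
print(both)
```

Counts like these are a simple starting point for the basic statistics or the social network analysis mentioned at the beginning.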

