# Parsing stops with Akka Streams mapAsync

user6719, published on September 21, 2018, 8:02 am

I am parsing 50,000 records from a web page; each record has a title and a URL. As I parse them, I write them to a PostgreSQL database. The application is deployed with docker-compose. However, it keeps stopping on some page for no apparent reason. I added logging to find out what is happening, but no connection error or anything similar shows up.

Here is my code for parsing and writing to the database:

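    import akka.actor.ActorSystem
    import akka.stream.ActorMaterializer
    import akka.stream.scaladsl.{Sink, Source}
    // JsoupBrowser is assumed to come from the scala-scraper library
    import net.ruippeixotog.scalascraper.browser.JsoupBrowser

    object App {
      val browser = JsoupBrowser()
      // `db` and the repository classes are not defined in the posted snippet
      val catRepo = new CategoryRepo(db)
      val torrentRepo = new TorrentRepo(db)
      val torrentForParseRepo = new TorrentForParseRepo(db)
      val parallelismFactor = 10
      val groupFactor = 10
      implicit val system = ActorSystem("TolokaParser")
      implicit val materializer = ActorMaterializer()
      implicit val executionContext = system.dispatcher

      // Groups the parsed records into batches and writes up to
      // `parallelismFactor` batches to the database concurrently.
      def parseAndWriteTorrentsForParseToDb(doc: App.browser.DocumentType) = {
        Source(getRecordsLists(doc))
          .grouped(groupFactor)
          .mapAsync(parallelismFactor) { torrentForParse: Seq[TorrentForParse] =>
            torrentForParseRepo.createInBatch(torrentForParse)
          }
          .runWith(Sink.ignore)
      }

      // Collects the records from every page reachable from the home page.
      def getRecordsLists(doc: App.browser.DocumentType) = {
        val pages = generatePagesFromHomePage(doc)
        println(pages.size)
        val result = for {
          page <- pages
        } yield {
          println(s"Parsing torrent list...$page")
          // `tmp` is not defined in the posted snippet; it presumably holds
          // the records parsed from `page`.
          println(tmp.size)
          tmp
        }
        result.flatten
      }

    }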


What could be causing this?

• `.runWith(Sink.ignore)` returns a Future. Do you handle the case where it fails with an exception? – Luca T. Feb 13 at 20:01
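
To illustrate the comment above, here is a minimal sketch of how a failure inside `mapAsync` surfaces through the Future returned by `.runWith(Sink.ignore)`. It assumes Akka Streams with `ActorMaterializer`, as in the question. `fakeCreateInBatch` is a hypothetical stand-in for `torrentForParseRepo.createInBatch`, and the simulated failure is invented for the example; it is not a diagnosis of the actual problem.

```scala
import akka.Done
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object FailureHandlingSketch {
  implicit val system: ActorSystem = ActorSystem("FailureHandlingSketch")
  implicit val materializer: ActorMaterializer = ActorMaterializer()
  import system.dispatcher

  // Hypothetical stand-in for torrentForParseRepo.createInBatch:
  // the batch containing 25 fails, imitating a database error.
  def fakeCreateInBatch(batch: Seq[Int]): Future[Int] =
    if (batch.contains(25)) Future.failed(new RuntimeException("simulated DB failure"))
    else Future.successful(batch.size)

  // Same shape as the pipeline in the question. With the default
  // supervision strategy, a failed Future in mapAsync stops the stream
  // and fails the materialized Future[Done].
  def run(): Future[Done] =
    Source(1 to 50)
      .grouped(10)
      .mapAsync(4)(fakeCreateInBatch)
      .runWith(Sink.ignore)

  def main(args: Array[String]): Unit =
    run().onComplete { outcome =>
      outcome match {
        case Success(_) => println("Stream completed normally")
        case Failure(e) => println(s"Stream failed: ${e.getMessage}")
      }
      system.terminate()
    }
}
```

If the Future is never inspected, as in the code from the question, the stream stops at the first failed batch and nothing is logged, which would look like parsing silently stopping on some page.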