Annotated Papers

Entity Resolution (collection of papers courtesy Jay Pujara)

Aizawa-wiri2005_FastLinkageDetectionSchemeForMultiSourceInfoIntegration

Ananthakrishna-VLDB02_EliminatingFuzzyDuplicates

Arasu-VLDB06_EfficientExactSetSimJoins

Baxter-kdd03_ComparisonofFastBlockingMethodsRecordLinkage

Becker-wsdm10-LearningSimilarityMetricsForEventIDInSocialMedia

Benjelloun-07_DSwoosh

Benjelloun-VLDBxx_SwooshGenericApproachToER

bhattacharya-kdd06_QueryTimeER

bhattacharya-tkddxx_CollectiveERInRelationalData

bhattacharya-thesis-06

pujara-starai16

Bilenko-icdm06_AdaptiveBlockingLearningToScaleRecordLinkage

Bilenko-kdd03_AdaptiveDupDetectionUsingLearnableStringSim

Cohen-kdd02_LearningToMatchAndClusterLargeHighDimDataSets

Cohen-MFIR01_LearningToMatchAndClusterEntityNames

Culotta-aaai07_AuthorDisambiguationUsingErrorDrivenMLWithRankingLoss

Culotta-hlt07_FirstOrderProbModelsForCorefResolution

Culotta-kdd07_CanonicalizationOfDBRecordsUsingAdaptiveSimMeasures

Domingos-xx_MultiRelationalRecordLinkage

Dong-sigmod05_RefReconciliationInComplexInfoSpaces

druck07reducing

Elmagarmid_DuplicateRecordDetectionSurvey

Elmagarmid-tkde2007_DuplicateRecordDetectionSurvey

Ertoz-SIAMxx_FindingClustersOfDifferentSizesShapesInHighDims

LibenNovell-jasist07_LinkPredictionForSocialNetworks

Koudas-sigmod06_RecordLinkageTutorial

Kopcke-vldb10_EvaluationOfEntityResolutionApproaches

Kopcke-dke09_FrameworksForEntityMatchingAComparison

Kanani_ResourceBoundedInfoGatheringForCorrClustering

Kanani-pakdd10_ResourceBoundedInfoExtractionAcquiringMissingFeaturesOnDemand

Kanani-aaai07_EfficientStrategiesForImprovingPartitioningBasedAuthorCorefIncorporatingWebPagesNodes

Huang-pkdd06_EfficientNameDisambiguationForLargeDBs

Han-JCDL04_TwoSupervisedLearningApproachesForNameDisambiguationAuthorCitations

Hadjieleftheriou-icde08_FastIndexesAndAlgosForSetSimSelectionQueries

Guha-VLDB04_MergingResultsApproximateMatchOps

Yao-kdd09_EfficientMethodsTopicModelInferenceStreamingDocumentCollections

Yao-emnlp10_CollectiveCrossDocRelationExtractionWithoutLabelledData

Winkler_OverviewOfRecordLinkage

Wick-VLDB08_DiscriminativeApproachToOntologyMapping

Wick-TR09_AdvancesLearningInferencePartitionwiseModelsofCorefResolution

Wick-KDD08_UnifiedApproachSchemaMatchingCorefCanonicalization

Whang-VLDB10_ERWithEvolvingRules

Whang-TRxx_DisinformationTechniquesForER

Whang-TR_PayAsYouGoER

Whang-sigmod09-ERWithIterativeBlocking

Whang-ICDE12_JointER

Wang-nips06_GroupAndTopicDiscoveryFromRelationsAndAttributes

Verykios-xx_AutomatingTheApproximateRecordMatchingProcess

Verbeek-dmkd04_AcceleratedEMClusteringOfLargeData

Treeratpituk-JCDL09_DisambiguationAuthorsInAcademicPubsUsingRandomForests

Thiesson-TR01_AcceleratingEMForLargeDBs

Singh-xx_DistantlyLabelingDataForLargeScaleCrossDocCoref

Singh-ACL11_LargeScaleCrossDocCorefUsingDistributedInferenceAndHierarchicalModels

Sarawagi-kdd02_InteractiveDedupUsingActiveLearning

rosenzvi-TIS10_LearningAuthorTopicModelsFromTextCorpora

Ravikumar-UAI04-HierarchicalGraphicalModelForRecordLinkage

papadakis-wsdm11_EfficientERForLargeHeteroInfoSpaces

McCallum-kdd00_EfficientClusteringofHighDimensionalDataSets

McCallum-ijcaiws2003_ConditionalModelsOfIdentityUncertainty

Goldberg-KDD95_RestructuringDBsForKnowledgeDiscoveryByConsolidationAndLinkFormation

12-DasSarma-CBLOCK

16-Efthymiou-BenchmarkingBlockingAlgoForWebEntities

Entity Resolution in the Web of Data

Parallel or Distributed Entity Resolution

2007-Benjelloun-DSwoosh

11-Niu-Felix

17-Efthymiou-ParallelMetablocking

2006-Kawai-PSwoosh

11-DalBianco-FastApproachForParallelDedupe

14-Malhotra-GraphParallelERUsingLSHandIMM

10-Kirsten-DataPartitioningForParallelEntityMatching

09-Whang-NegRules

07-Kim,Lee-ParallelLinkage

ADMM, Distributed ADMM, and Large-Scale Graph Processing

zhange-async-distr-admm-ICML14

pujara-thesis15

martins-AugLag-ICML11

gabay75

boyd-ADMM-2011

bach-NIPS12

bach-JMLR

pregel_paper

graphx

Graphlab

Hypergraph-Partitioned Vertex Programming Approach

one-trillion-edges-graph-processing-at-facebook-scale

Valiant-BulkSynchronousParallelModel

Distributed Systems

A-note-on-distributed-computing (Jim Waldo, Geoff Wyant, Ann Wollrath, Sam Kendall)

Implementing-remote-procedure-calls (Andrew Birrell and Bruce Nelson)

Dynamo: Amazon’s Highly Available Key-value Store (DeCandia, Hastorun, et al)

Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (Stoica, Morris, et al)

Using Lightweight Modeling To Understand Chord (Pamela Zave)

Impossibility of Distributed Consensus with One Faulty Process (Fischer, Lynch, Paterson)

Using reasoning about knowledge to analyze distributed systems (Halpern)

Lineage Driven Fault Injection (Alvaro et al)

Other

ProbabilisticProgrammingConcepts (De Raedt, Kimmig)
