Pig get distinct rows with counts



By : rhladr
Date : November 22 2020, 10:56 AM
I have a Pig table (called table1) with more than one column (col1, col2) that contains many duplicate rows. Try grouping on all of the columns and counting the rows in each group:
code :
A = LOAD 'data'....;
GR = GROUP A by (col1,col2);
CNT = FOREACH GR GENERATE FLATTEN (group) AS (col1,col2) , COUNT(A) as cnt_col;
dump CNT;
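For readers who want to check the expected result outside Pig, here is a minimal Python sketch of the same group-and-count idea. The sample rows and the helper name `distinct_with_counts` are made up for illustration and are not part of the original answer.

```python
from collections import Counter

# Hypothetical sample rows standing in for (col1, col2) in table1.
rows = [("a", 1), ("a", 1), ("b", 2), ("a", 3), ("b", 2), ("b", 2)]

def distinct_with_counts(records):
    """Return each distinct (col1, col2) row with its occurrence count."""
    counts = Counter(records)  # plays the role of GROUP ... BY plus COUNT
    return [(col1, col2, cnt) for (col1, col2), cnt in sorted(counts.items())]

result = distinct_with_counts(rows)
for row in result:
    print(row)
```

Each output tuple corresponds to one row of CNT in the Pig script: the flattened group key followed by cnt_col.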


Can I use a single MySQL query to select distinct rows and then non-distinct rows if a limit hasn't been reached?



By : user2030755
Date : March 29 2020, 07:55 AM
You can select the distinct rows first and combine them in a union with your non-distinct query, limiting the combined result to the maximum row count you want to retrieve. Note that a plain UNION removes duplicate rows, so UNION ALL is needed for the non-distinct rows to survive, and the row order is only guaranteed with an explicit ORDER BY.
i.e. select field1, field2, ... from (select distinct field1, field2, ... from ... UNION ALL select field1, field2, ... from ... LIMIT MAX_ROW_COUNT) AS total LIMIT MAX_ROW_COUNT
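The idea can be tried out with a small sketch using Python's built-in sqlite3 module rather than MySQL. The table name `items`, its sample data, and the extra `priority` column are assumptions added for illustration; the `priority` column is one possible way to make "distinct rows first" explicit, not something the original answer prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (field1 TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("x",), ("x",), ("y",), ("z",), ("z",)],
)

# priority 0 marks the distinct pass, priority 1 the non-distinct pass.
QUERY = """
SELECT field1 FROM (
    SELECT DISTINCT field1, 0 AS priority FROM items
    UNION ALL
    SELECT field1, 1 AS priority FROM items
) AS total
ORDER BY priority
LIMIT ?
"""

def distinct_first(limit):
    """Return up to `limit` values, distinct ones before the fill-in rows."""
    return [row[0] for row in conn.execute(QUERY, (limit,))]

top5 = distinct_first(5)
print(top5)
```

With three distinct values in the table, the first three entries are the distinct values (in no particular order) and the remaining two are filled from the non-distinct pass.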
Counts rows returned by query that uses distinct



By : Zohre Gorji
Date : March 29 2020, 07:55 AM
Your query is straight to the point, does the job, and is simple enough. You could bake some unnecessary rocket science into it, but that would be overblown, IMHO. As an alternative to what you have, you can use GROUP BY as in the query below, but you will basically be doing the same thing: getting the unique rows and counting them.
code :
SELECT COUNT(1)
FROM (SELECT a
        FROM MyTable
        WHERE a = 'a'
        GROUP BY a, b, c) Temp
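To see that the GROUP BY form and a SELECT DISTINCT form count the same thing, here is a small sketch using Python's built-in sqlite3 module. The sample data is made up, and SQLite stands in for whatever database the original question used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (a TEXT, b TEXT, c TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [("a", "x", "1"), ("a", "x", "1"), ("a", "y", "1"), ("b", "x", "1")],
)

# Count the groups produced by GROUP BY a, b, c.
group_by_count = conn.execute(
    """
    SELECT COUNT(1)
    FROM (SELECT a FROM MyTable WHERE a = 'a' GROUP BY a, b, c) AS Temp
    """
).fetchone()[0]

# Count the rows produced by SELECT DISTINCT a, b, c.
distinct_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT DISTINCT a, b, c FROM MyTable WHERE a = 'a') AS Temp
    """
).fetchone()[0]

print(group_by_count, distinct_count)
```

Both queries return 2 for this data, because the three rows with a = 'a' collapse to two unique (a, b, c) combinations.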
Getting individual counts of last three distinct rows in column of data retrieved from multiple tables



By : alaveh helper
Date : March 29 2020, 07:55 AM
I have a query that returns several rows of a single DateTime column, obtained by joining multiple SQL tables. I want the individual counts for the latest three distinct dates, with the data sorted from earliest to latest. I would do this with TOP and GROUP BY:
code :
SELECT TOP 3 ST.EffectiveDate, COUNT(*) as cnt
FROM Person.Contact C INNER JOIN
     Sales.SalesPerson SP
     ON C.ContactID = SP.SalesPersonID FULL OUTER JOIN
     Sales.SalesTerritory ST
     ON ST.TerritoryID = SP.TerritoryID
GROUP BY ST.EffectiveDate
ORDER BY ST.EffectiveDate DESC
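The same pattern can be tried with Python's built-in sqlite3 module. This is only a sketch: SQLite has no TOP, so LIMIT 3 is used instead, and the single `events` table with ISO date strings is a made-up stand-in for the joined result in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (effective_date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [
        ("2020-01-01",),
        ("2020-01-02",), ("2020-01-02",),
        ("2020-01-03",),
        ("2020-01-04",), ("2020-01-04",), ("2020-01-04",),
    ],
)

# Group by date, sort newest first, keep only the three latest dates.
latest_three = conn.execute(
    """
    SELECT effective_date, COUNT(*) AS cnt
    FROM events
    GROUP BY effective_date
    ORDER BY effective_date DESC
    LIMIT 3
    """
).fetchall()

for date, cnt in latest_three:
    print(date, cnt)
```

ISO-formatted date strings sort chronologically, which is why ordering the text column works in this sketch.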
Find distinct values, not distinct counts in elasticsearch



By : Arman Ali
Date : March 29 2020, 07:55 AM
Use a terms aggregation on the color field. Pay attention to how the field you want distinct values for is analyzed: make sure it is not tokenized at index time, otherwise each bucket in the aggregation will be an individual term from the field content rather than the whole value.
If you still want tokenization and also want to use the terms aggregation, consider indexing the field as not_analyzed in addition to the analyzed version, for example by using multi fields.
code :
GET /cars/transactions/_search?search_type=count
{
  "aggs": {
    "distinct_colors": {
      "terms": {
        "field": "color",
        "size": 1000
      }
    }
  }
}
How to find number of distinct phones per customer and put the customers(counts) in different buckets as per the counts?



By : Anthony M
Date : March 29 2020, 07:55 AM
Below is the table where I have customer_id and the different phones each customer has. Use two aggregations: first count the distinct phones per customer, then count the customers for each phone count.
code :
select cnt, count(*), min(customer_id), max(customer_id)
from (select customer_id, count(distinct phone_number) as cnt
      from customer_phone
      group by customer_id
     ) c
group by cnt
order by cnt;
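Here is a small sketch of the two-level aggregation using Python's built-in sqlite3 module. The sample customers and phone numbers are made up for illustration; the query itself follows the one above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_phone (customer_id INTEGER, phone_number TEXT)")
conn.executemany(
    "INSERT INTO customer_phone VALUES (?, ?)",
    [
        (1, "A"), (1, "A"), (1, "B"),   # 2 distinct phones
        (2, "C"),                       # 1 distinct phone
        (3, "D"), (3, "E"),             # 2 distinct phones
        (4, "F"),                       # 1 distinct phone
        (5, "G"), (5, "H"), (5, "I"),   # 3 distinct phones
    ],
)

# Inner query: distinct phones per customer. Outer query: customers per count.
buckets = conn.execute(
    """
    SELECT cnt, COUNT(*), MIN(customer_id), MAX(customer_id)
    FROM (SELECT customer_id, COUNT(DISTINCT phone_number) AS cnt
          FROM customer_phone
          GROUP BY customer_id) AS c
    GROUP BY cnt
    ORDER BY cnt
    """
).fetchall()

for row in buckets:
    print(row)
```

Each output row is (number of distinct phones, number of customers in that bucket, smallest customer_id, largest customer_id), so the duplicate phone for customer 1 does not inflate that customer's count.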